Re: Libpq support to connect to standby server as priority
Hi All,
I recently proposed supporting a 'prefer-read' value for
target_session_attrs in libpq. I have now updated the patch to add
documentation (sgml) and a regression test case.
Some people may have noticed that there is already another patch (
https://commitfest.postgresql.org/15/1148/ ) that looks similar to this one.
However, that patch is more complex than my proposal.
It would be better to consider the two patches separately.
Regards,
Jing Wang
Fujitsu Australia
Attachments:
libpq_support_perfer-read_002.patch (application/octet-stream)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 02884ba..d263aab 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1500,13 +1500,23 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
If this parameter is set to <literal>read-write</literal>, only a
connection in which read-write transactions are accepted by default
- is considered acceptable. The query
+ is considered acceptable. The query
<literal>SHOW transaction_read_only</literal> will be sent upon any
successful connection; if it returns <literal>on</literal>, the connection
will be closed. If multiple hosts were specified in the connection
string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
+ attempt had failed.
+ </para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>
+ the libpq will try to connect to a read-only transactions supported server
+ firstly. If failed to connect to a read-only transactions supported server
+ then the libpq will try to connect to a read-write transactions supported
+ server.
+ </para>
+ <para>
+ The default value of this parameter,<literal>any</literal>, regards all
+ connections as acceptable.
</para>
</listitem>
</varlistentry>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 77eebb0..3fd4c0f 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -327,7 +327,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1184,7 +1184,8 @@ connectOptions2(PGconn *conn)
if (conn->target_session_attrs)
{
if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ && strcmp(conn->target_session_attrs, "read-write") != 0
+ && strcmp(conn->target_session_attrs, "prefer-read") != 0)
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -2086,8 +2087,22 @@ keep_going: /* We will come back to here until there is
{
if (++conn->whichhost >= conn->nconnhost)
{
- conn->whichhost = 0;
- break;
+ if (conn->primary_host_index > 0 &&
+ strcmp(conn->target_session_attrs,"prefer-read") ==0 )
+ {
+ /*
+ * Go to here means failed to connect to read-only servers
+ * and now try connect to read-write server again.
+ * Only under the 'prefer-read' scenario will go to here.
+ */
+ conn->addr_cur = conn->connhost[conn->primary_host_index].addrlist;
+ conn->whichhost = conn->primary_host_index;
+ }
+ else
+ {
+ conn->whichhost = 0;
+ break;
+ }
}
conn->addr_cur =
conn->connhost[conn->whichhost].addrlist;
@@ -2341,6 +2356,14 @@ keep_going: /* We will come back to here until there is
conn->status = CONNECTION_NEEDED;
goto keep_going;
}
+ else if (conn->primary_host_index >= 0 &&
+ strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ {
+ conn->addr_cur = conn->connhost[conn->primary_host_index].addrlist;
+ conn->whichhost = conn->primary_host_index;
+ conn->status = CONNECTION_NEEDED;
+ goto keep_going;
+ }
goto error_return;
}
@@ -2978,10 +3001,12 @@ keep_going: /* We will come back to here until there is
}
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required,
+ * see if we have one.
*/
if (conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ (strcmp(conn->target_session_attrs, "read-write") == 0 ||
+ strcmp(conn->target_session_attrs, "prefer-read") == 0))
{
/*
* We are yet to make a connection. Save all existing
@@ -3042,10 +3067,12 @@ keep_going: /* We will come back to here until there is
}
/*
- * If a read-write connection is requested check for same.
+ * If a read-write or prefer-read connection is requested
+ * check for same.
*/
if (conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ (strcmp(conn->target_session_attrs, "read-write") == 0 ||
+ strcmp(conn->target_session_attrs, "prefer-read") == 0))
{
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
@@ -3124,57 +3151,130 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server = false;
val = PQgetvalue(res, 0, 0);
if (strncmp(val, "on", 2) == 0)
+ readonly_server = true;
+
+ if (strcmp(conn->target_session_attrs, "read-write") == 0)
{
- const char *displayed_host;
- const char *displayed_port;
+ if(readonly_server)
+ {
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
- PQclear(res);
- restoreErrorMessage(conn, &savedMessage);
+ /* Not writable; close connection. */
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+ pqDropConnection(conn, true);
- /* Not writable; close connection. */
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
- pqDropConnection(conn, true);
+ /* Skip any remaining addresses for this host. */
+ conn->addr_cur = NULL;
+ if (conn->whichhost + 1 < conn->nconnhost)
+ {
+ conn->status = CONNECTION_NEEDED;
+ goto keep_going;
+ }
- /* Skip any remaining addresses for this host. */
- conn->addr_cur = NULL;
- if (conn->whichhost + 1 < conn->nconnhost)
+ /* No more addresses to try. So we fail. */
+ goto error_return;
+ }
+ else /* server support read-write */
{
- conn->status = CONNECTION_NEEDED;
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /* We can release the address lists now. */
+ release_all_addrinfo(conn);
+
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
goto keep_going;
}
-
- /* No more addresses to try. So we fail. */
- goto error_return;
}
- PQclear(res);
- termPQExpBuffer(&savedMessage);
+ else /* conn->target_session_attrs is prefer-read */
+ {
+ if(readonly_server)
+ {
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
- /* We can release the address lists now. */
- release_all_addrinfo(conn);
+ /* We can release the address lists now. */
+ release_all_addrinfo(conn);
- /*
- * Finish reading any remaining messages before being
- * considered as ready.
- */
- conn->status = CONNECTION_CONSUME;
- goto keep_going;
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+ else /* server support read-write */
+ {
+ if ((conn->primary_host_index < 0) && (conn->whichhost + 1 < conn->nconnhost))
+ {
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /*
+ * Connecting to a writable server, close it
+ * and try to connect to another one.
+ */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+ pqDropConnection(conn, true);
+
+ /* Skip any remaining addresses for this host. */
+ conn->addr_cur = NULL;
+
+ conn->status = CONNECTION_NEEDED;
+
+ /* Record primary host index */
+ conn->primary_host_index = conn->whichhost;
+ goto keep_going;
+ }
+ else /* No more host to connect, keep this connection */
+ {
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /* We can release the address lists now. */
+ release_all_addrinfo(conn);
+
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+ }
+ }
}
/*
@@ -3393,6 +3493,8 @@ makeEmptyPGconn(void)
conn = NULL;
}
+ conn->primary_host_index = -1;
+
return conn;
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4e35409..a3e582d 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -362,7 +362,7 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /* Type of connection to make. Possible values: any, read-write, prefer-read. */
char *target_session_attrs;
/* Optional file to write trace info to */
@@ -396,6 +396,7 @@ struct pg_conn
int nconnhost; /* # of possible hosts */
int whichhost; /* host we're currently considering */
pg_conn_host *connhost; /* details about each possible host */
+ int primary_host_index; /* index for primary host in connhost */
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index fb27925..09816e0 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 28;
+use Test::More tests => 31;
# Initialize master node
my $node_master = get_new_node('master');
@@ -115,6 +115,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
On Wed, Jan 24, 2018 at 9:01 AM Jing Wang <jingwangian@gmail.com> wrote:
> Hi All,
> Recently I put a proposal to support 'prefer-read' parameter in
> target_session_attrs in libpq. Now I updated the patch with adding content
> in the sgml and regression test case.
> Some people may have noticed there is already another patch (
> https://commitfest.postgresql.org/15/1148/ ) which looks similar with
> this. But I would say this patch is more complex than my proposal.
> It is better separate these 2 patches to consider.
I also feel that prefer-read and read-only need to be treated as two
different options.
prefer-read is simpler to support than read-only.
Here I attached an updated patch that is rebased on the latest master and
also fixes some of the corner scenarios.
Regards,
Haribabu Kommi
Fujitsu Australia
Attachments:
0001-Allow-target-session-attrs-to-accept-prefer-read-opti.patch (application/octet-stream)
From 0ac2c799bd6bc75ba039520dcf8474be52688e7b Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 7 Jun 2018 18:11:51 +1000
Subject: [PATCH] Allow target-session-attrs to accept prefer-read option
With this new prefer-read option, the connection is preferred
to connect to a read-only server if available in the connection
string, otherwise connect to a read-write server.
---
doc/src/sgml/libpq.sgml | 13 ++-
src/interfaces/libpq/fe-connect.c | 208 ++++++++++++++++++++++++++--------
src/interfaces/libpq/libpq-int.h | 3 +-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
4 files changed, 189 insertions(+), 49 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 498b8df988..05562e1ec0 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1592,8 +1592,17 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
successful connection; if it returns <literal>on</literal>, the connection
will be closed. If multiple hosts were specified in the connection
string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
+ attempt had failed.
+ </para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, a connection
+ in which read-only transactions are supported server is preferred. If
+ failed to connect to a read-only transactions supported server, then
+ a connection to the read-write transactions supported server is accepted.
+ </para>
+ <para>
+ The default value of this parameter,<literal>any</literal>, regards all
+ connections as acceptable.
</para>
</listitem>
</varlistentry>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index a7e969d7c1..9b16d57e0d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -327,7 +327,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1184,7 +1184,8 @@ connectOptions2(PGconn *conn)
if (conn->target_session_attrs)
{
if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ && strcmp(conn->target_session_attrs, "read-write") != 0
+ && strcmp(conn->target_session_attrs, "prefer-read") != 0)
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -2086,8 +2087,23 @@ keep_going: /* We will come back to here until there is
{
if (++conn->whichhost >= conn->nconnhost)
{
- conn->whichhost = 0;
- break;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Go to here means failed to connect to
+ * read-only servers and now try connect to
+ * read-write server again. Only under the
+ * 'prefer-read' scenario will go to here.
+ */
+ conn->addr_cur = conn->connhost[conn->read_write_host_index].addrlist;
+ conn->whichhost = conn->read_write_host_index;
+ conn->read_write_host_index = -1;
+ }
+ else
+ {
+ conn->whichhost = 0;
+ break;
+ }
}
conn->addr_cur =
conn->connhost[conn->whichhost].addrlist;
@@ -2112,6 +2128,14 @@ keep_going: /* We will come back to here until there is
conn->addr_cur = addr_cur->ai_next;
continue;
}
+ else if (conn->read_write_host_index >= 0)
+ {
+ conn->addr_cur = conn->connhost[conn->read_write_host_index].addrlist;
+ conn->whichhost = conn->read_write_host_index;
+ conn->read_write_host_index = -1;
+ continue;
+ }
+
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("could not create socket: %s\n"),
SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
@@ -2341,6 +2365,14 @@ keep_going: /* We will come back to here until there is
conn->status = CONNECTION_NEEDED;
goto keep_going;
}
+ else if (conn->read_write_host_index >= 0)
+ {
+ conn->addr_cur = conn->connhost[conn->read_write_host_index].addrlist;
+ conn->whichhost = conn->read_write_host_index;
+ conn->read_write_host_index = -1;
+ conn->status = CONNECTION_NEEDED;
+ goto keep_going;
+ }
goto error_return;
}
@@ -2978,10 +3010,12 @@ keep_going: /* We will come back to here until there is
}
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required, see
+ * if we have one.
*/
if (conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ (strcmp(conn->target_session_attrs, "read-write") == 0 ||
+ strcmp(conn->target_session_attrs, "prefer-read") == 0))
{
/*
* We are yet to make a connection. Save all existing
@@ -3042,10 +3076,12 @@ keep_going: /* We will come back to here until there is
}
/*
- * If a read-write connection is requested check for same.
+ * If a read-write or prefer-read connection is requested check
+ * for same.
*/
if (conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ (strcmp(conn->target_session_attrs, "read-write") == 0 ||
+ strcmp(conn->target_session_attrs, "prefer-read") == 0))
{
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
@@ -3128,53 +3164,133 @@ keep_going: /* We will come back to here until there is
val = PQgetvalue(res, 0, 0);
if (strncmp(val, "on", 2) == 0)
{
- const char *displayed_host;
- const char *displayed_port;
+ if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ {
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
- PQclear(res);
- restoreErrorMessage(conn, &savedMessage);
+ /* Not writable; close connection. */
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+ pqDropConnection(conn, true);
- /* Not writable; close connection. */
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
- pqDropConnection(conn, true);
+ /* Skip any remaining addresses for this host. */
+ conn->addr_cur = NULL;
+ if (conn->whichhost + 1 < conn->nconnhost)
+ {
+ conn->status = CONNECTION_NEEDED;
+ goto keep_going;
+ }
- /* Skip any remaining addresses for this host. */
- conn->addr_cur = NULL;
- if (conn->whichhost + 1 < conn->nconnhost)
+ /* No more addresses to try. So we fail. */
+ goto error_return;
+ }
+ else /* conn->target_session_attrs is prefer-read */
{
- conn->status = CONNECTION_NEEDED;
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /* We can release the address lists now. */
+ release_all_addrinfo(conn);
+
+ /*
+ * Finish reading any remaining messages before
+ * being considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
goto keep_going;
}
-
- /* No more addresses to try. So we fail. */
- goto error_return;
}
- PQclear(res);
- termPQExpBuffer(&savedMessage);
+ else /* server support read-write */
+ {
+ if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ {
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
- /* We can release the address lists now. */
- release_all_addrinfo(conn);
+ /* We can release the address lists now. */
+ release_all_addrinfo(conn);
- /*
- * Finish reading any remaining messages before being
- * considered as ready.
- */
- conn->status = CONNECTION_CONSUME;
- goto keep_going;
+ /*
+ * Finish reading any remaining messages before
+ * being considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+ else /* conn->target_session_attrs is prefer-read */
+ {
+ /* is it the last connection? */
+ if (conn->whichhost + 1 < conn->nconnhost)
+ {
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Not read-only; close connection. */
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a read-only "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /*
+ * Connecting to a writable server, close it
+ * and try to connect to another one.
+ */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+ pqDropConnection(conn, true);
+
+ /* Skip any remaining addresses for this host. */
+ conn->addr_cur = NULL;
+
+ conn->status = CONNECTION_NEEDED;
+
+ /* Record read-write host index, if not yet */
+ if (conn->read_write_host_index < 0)
+ conn->read_write_host_index = conn->whichhost;
+
+ goto keep_going;
+ }
+ else /* No more host to connect, keep this
+ * connection */
+ {
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /* We can release the address lists now. */
+ release_all_addrinfo(conn);
+
+ /*
+ * Finish reading any remaining messages
+ * before being considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+ }
+ }
}
/*
@@ -3393,6 +3509,8 @@ makeEmptyPGconn(void)
conn = NULL;
}
+ conn->read_write_host_index = -1;
+
return conn;
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 9a586ff25a..39504999b3 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -362,7 +362,7 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /* Type of connection to make. Possible values: any, read-write, prefer-read. */
char *target_session_attrs;
/* Optional file to write trace info to */
@@ -396,6 +396,7 @@ struct pg_conn
int nconnhost; /* # of possible hosts */
int whichhost; /* host we're currently considering */
pg_conn_host *connhost; /* details about each possible host */
+ int read_write_host_index; /* index for first read-write host in connhost */
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index a0d3e8f357..e994306f28 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 28;
+use Test::More tests => 31;
# Initialize master node
my $node_master = get_new_node('master');
@@ -117,6 +117,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.16.1.windows.4
Haribabu Kommi wrote:
> On Wed, Jan 24, 2018 at 9:01 AM Jing Wang <jingwangian@gmail.com> wrote:
> > Hi All,
> > Recently I put a proposal to support 'prefer-read' parameter in target_session_attrs in libpq. Now I updated the patch with adding content in the sgml and regression test case.
> > Some people may have noticed there is already another patch (https://commitfest.postgresql.org/15/1148/ ) which looks similar with this. But I would say this patch is more complex than my proposal.
> > It is better separate these 2 patches to consider.
> I also feel prefer-read and read-only options needs to take as two different options.
> prefer-read is simple to support than read-only.
> Here I attached an updated patch that is rebased to the latest master and also
> fixed some of the corner scenarios.
The patch applies, builds and passes "make check-world".
I think the "prefer-read" functionality is desirable: It is exactly what you need
if you want to use replication for load balancing, and your application supports
different database connections for reading and writing queries.
"read-only" does not have a clear use case in my opinion.
With the patch, PostgreSQL behaves as expected if I have a primary and a standby and run:
psql "host=/tmp,/tmp port=5433,5434 target_session_attrs=prefer-read"
But if I stop the standby (port 5434), libpq goes into an endless loop.
Concerning the code:
- The documentation needs some attention. Suggestion:
If this parameter is set to <literal>prefer-read</literal>, connections
where <literal>SHOW transaction_read_only</literal> returns on are preferred.
If no such connection can be found, a connection that allows read-write
transactions will be accepted.
- I think the construction with "read_write_host_index" makes the code even more
complicated than it already is.
What about keeping the first successful connection open and storing it in a
variable if we are in "prefer-read" mode?
If we get the read-only connection we desire, close that cached connection;
otherwise use it.
Yours,
Laurenz Albe
On Wed, Jul 4, 2018 at 11:14 PM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:
> Haribabu Kommi wrote:
> > On Wed, Jan 24, 2018 at 9:01 AM Jing Wang <jingwangian@gmail.com> wrote:
> > > Hi All,
> > > Recently I put a proposal to support 'prefer-read' parameter in
> > > target_session_attrs in libpq. Now I updated the patch with adding content
> > > in the sgml and regression test case.
> > > Some people may have noticed there is already another patch (
> > > https://commitfest.postgresql.org/15/1148/ ) which looks similar with
> > > this. But I would say this patch is more complex than my proposal.
> > > It is better separate these 2 patches to consider.
> > I also feel prefer-read and read-only options needs to take as two
> > different options.
> > prefer-read is simple to support than read-only.
> > Here I attached an updated patch that is rebased to the latest master
> > and also fixed some of the corner scenarios.
Thanks for the review.
> The patch applies, builds and passes "make check-world".
> I think the "prefer-read" functionality is desirable: It is exactly what
> you need if you want to use replication for load balancing, and your
> application supports different database connections for reading and
> writing queries.
> "read-only" does not have a clear use case in my opinion.
> With the patch, PostgreSQL behaves as expected if I have a primary and a
> standby and run:
> psql "host=/tmp,/tmp port=5433,5434 target_session_attrs=prefer-read"
> But if I stop the standby (port 5434), libpq goes into an endless loop.
There was a problem in reusing the primary host index, which led to the loop.
The attached patch fixes the issue.
> Concerning the code:
> - The documentation needs some attention. Suggestion:
> If this parameter is set to <literal>prefer-read</literal>, connections
> where <literal>SHOW transaction_read_only</literal> returns on are
> preferred.
> If no such connection can be found, a connection that allows read-write
> transactions will be accepted.
Updated as per your comment.
> - I think the construction with "read_write_host_index" makes the code
> even more complicated than it already is.
> What about keeping the first successful connection open and storing it
> in a variable if we are in "prefer-read" mode?
> If we get the read-only connection we desire, close that cached
> connection, otherwise use it.
Even if we add a variable to cache the connection, I don't think the logic of
checking the next host for a read-only server will change, but the extra
connection request to the read-write host will be removed.
Regards,
Haribabu Kommi
Fujitsu Australia
Attachments:
0001-Allow-taget-session-attrs-to-accept-prefer-read-opti_v2.patch (application/octet-stream)
From d0b94f2cb8c37e3a7f21c76ff362ca382347aced Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 7 Jun 2018 18:11:51 +1000
Subject: [PATCH] Allow taget-session-attrs to accept prefer-read option
With this new prefer-read option, the connection is preferred
to connect to a read-only server if available in the connection
string, otherwise connect to a read-write server.
---
doc/src/sgml/libpq.sgml | 13 ++-
src/interfaces/libpq/fe-connect.c | 209 ++++++++++++++++++++++++++--------
src/interfaces/libpq/libpq-int.h | 3 +-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
4 files changed, 190 insertions(+), 49 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d67212b831..c89b4267f5 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1592,8 +1592,17 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
successful connection; if it returns <literal>on</literal>, the connection
will be closed. If multiple hosts were specified in the connection
string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
+ attempt had failed.
+ </para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, connections
+ where <literal>SHOW transaction_read_only</literal> returns on are
+ preferred. If no such connections can be found, then a connection
+ that allows read-write transactions will be accepted.
+ </para>
+ <para>
+ The default value of this parameter,<literal>any</literal>, regards all
+ connections as acceptable.
</para>
</listitem>
</varlistentry>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index a7e969d7c1..5a693abd56 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -327,7 +327,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1184,7 +1184,8 @@ connectOptions2(PGconn *conn)
if (conn->target_session_attrs)
{
if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ && strcmp(conn->target_session_attrs, "read-write") != 0
+ && strcmp(conn->target_session_attrs, "prefer-read") != 0)
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -2086,8 +2087,23 @@ keep_going: /* We will come back to here until there is
{
if (++conn->whichhost >= conn->nconnhost)
{
- conn->whichhost = 0;
- break;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to
+ * a read-only server, so we now retry the
+ * recorded read-write server. This happens
+ * only in 'prefer-read' mode.
+ */
+ conn->addr_cur = conn->connhost[conn->read_write_host_index].addrlist;
+ conn->whichhost = conn->read_write_host_index;
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ conn->whichhost = 0;
+ break;
+ }
}
conn->addr_cur =
conn->connhost[conn->whichhost].addrlist;
@@ -2112,6 +2128,14 @@ keep_going: /* We will come back to here until there is
conn->addr_cur = addr_cur->ai_next;
continue;
}
+ else if (conn->read_write_host_index >= 0)
+ {
+ conn->addr_cur = conn->connhost[conn->read_write_host_index].addrlist;
+ conn->whichhost = conn->read_write_host_index;
+ conn->read_write_host_index = -2;
+ continue;
+ }
+
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("could not create socket: %s\n"),
SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
@@ -2341,6 +2365,14 @@ keep_going: /* We will come back to here until there is
conn->status = CONNECTION_NEEDED;
goto keep_going;
}
+ else if (conn->read_write_host_index >= 0)
+ {
+ conn->addr_cur = conn->connhost[conn->read_write_host_index].addrlist;
+ conn->whichhost = conn->read_write_host_index;
+ conn->read_write_host_index = -2;
+ conn->status = CONNECTION_NEEDED;
+ goto keep_going;
+ }
goto error_return;
}
@@ -2978,10 +3010,12 @@ keep_going: /* We will come back to here until there is
}
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required, see
+ * if we have one.
*/
if (conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ (strcmp(conn->target_session_attrs, "read-write") == 0 ||
+ strcmp(conn->target_session_attrs, "prefer-read") == 0))
{
/*
* We are yet to make a connection. Save all existing
@@ -3042,10 +3076,12 @@ keep_going: /* We will come back to here until there is
}
/*
- * If a read-write connection is requested check for same.
+ * If a read-write or prefer-read connection is requested check
+ * for same.
*/
if (conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ (strcmp(conn->target_session_attrs, "read-write") == 0 ||
+ strcmp(conn->target_session_attrs, "prefer-read") == 0))
{
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
@@ -3128,53 +3164,134 @@ keep_going: /* We will come back to here until there is
val = PQgetvalue(res, 0, 0);
if (strncmp(val, "on", 2) == 0)
{
- const char *displayed_host;
- const char *displayed_port;
+ if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ {
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
- PQclear(res);
- restoreErrorMessage(conn, &savedMessage);
+ /* Not writable; close connection. */
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+ pqDropConnection(conn, true);
- /* Not writable; close connection. */
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
- pqDropConnection(conn, true);
+ /* Skip any remaining addresses for this host. */
+ conn->addr_cur = NULL;
+ if (conn->whichhost + 1 < conn->nconnhost)
+ {
+ conn->status = CONNECTION_NEEDED;
+ goto keep_going;
+ }
- /* Skip any remaining addresses for this host. */
- conn->addr_cur = NULL;
- if (conn->whichhost + 1 < conn->nconnhost)
+ /* No more addresses to try. So we fail. */
+ goto error_return;
+ }
+ else /* conn->target_session_attrs is prefer-read */
{
- conn->status = CONNECTION_NEEDED;
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /* We can release the address lists now. */
+ release_all_addrinfo(conn);
+
+ /*
+ * Finish reading any remaining messages before
+ * being considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
goto keep_going;
}
-
- /* No more addresses to try. So we fail. */
- goto error_return;
}
- PQclear(res);
- termPQExpBuffer(&savedMessage);
+ else /* server support read-write */
+ {
+ if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ {
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
- /* We can release the address lists now. */
- release_all_addrinfo(conn);
+ /* We can release the address lists now. */
+ release_all_addrinfo(conn);
- /*
- * Finish reading any remaining messages before being
- * considered as ready.
- */
- conn->status = CONNECTION_CONSUME;
- goto keep_going;
+ /*
+ * Finish reading any remaining messages before
+ * being considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+ else /* conn->target_session_attrs is prefer-read */
+ {
+ /* is it the last connection? */
+ if ((conn->whichhost + 1 < conn->nconnhost) &&
+ (conn->read_write_host_index != -2))
+ {
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Not read-only; close connection. */
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a read-only "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /*
+ * Connecting to a writable server, close it
+ * and try to connect to another one.
+ */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+ pqDropConnection(conn, true);
+
+ /* Skip any remaining addresses for this host. */
+ conn->addr_cur = NULL;
+
+ conn->status = CONNECTION_NEEDED;
+
+ /* Record read-write host index, if not yet */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ goto keep_going;
+ }
+ else /* No more host to connect, keep this
+ * connection */
+ {
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /* We can release the address lists now. */
+ release_all_addrinfo(conn);
+
+ /*
+ * Finish reading any remaining messages
+ * before being considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+ }
+ }
}
/*
@@ -3393,6 +3510,8 @@ makeEmptyPGconn(void)
conn = NULL;
}
+ conn->read_write_host_index = -1;
+
return conn;
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 9a586ff25a..39504999b3 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -362,7 +362,7 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /* Type of connection to make. Possible values: any, read-write, prefer-read. */
char *target_session_attrs;
/* Optional file to write trace info to */
@@ -396,6 +396,7 @@ struct pg_conn
int nconnhost; /* # of possible hosts */
int whichhost; /* host we're currently considering */
pg_conn_host *connhost; /* details about each possible host */
+ int read_write_host_index; /* index for first read-write host in connhost */
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index a0d3e8f357..e994306f28 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 28;
+use Test::More tests => 31;
# Initialize master node
my $node_master = get_new_node('master');
@@ -117,6 +117,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.16.1.windows.4
On Wed, Jul 11, 2018 at 6:00 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
On Wed, Jul 4, 2018 at 11:14 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
Haribabu Kommi wrote:
- I think the construction with "read_write_host_index" makes the code even more
complicated than it already is.
What about keeping the first successful connection open and storing it in a
variable if we are in "prefer-read" mode.
If we get the read-only connection we desire, close that cached connection,
otherwise use it.
Even if we add a variable to cache the connection, I think the logic of checking
the next host for a read-only server will not change, but the extra connection
request to the read-write host will be removed.
I evaluated your suggestion of caching the connection and reusing it when no
read-only server is found, but I think it would add more complexity. Also, while
connections to the other servers are being attempted, the server may close the
cached connection because of a timeout.
I feel the extra time during connection may be acceptable if the user chooses the
prefer-read mode, rather than adding more complexity to handle the cached connection.
Comments?
Regards,
Haribabu Kommi
Fujitsu Australia
Haribabu Kommi wrote:
On Wed, Jul 4, 2018 at 11:14 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
Haribabu Kommi wrote:
- I think the construction with "read_write_host_index" makes the code even more
complicated than it already is.
What about keeping the first successful connection open and storing it in a
variable if we are in "prefer-read" mode.
If we get the read-only connection we desire, close that cached connection,
otherwise use it.
Even if we add a variable to cache the connection, I think the logic of checking
the next host for a read-only server will not change, but the extra connection
request to the read-write host will be removed.
I evaluated your suggestion of caching the connection and reusing it when no
read-only server is found, but I think it would add more complexity. Also, while
connections to the other servers are being attempted, the server may close the
cached connection because of a timeout.
I feel the extra time during connection may be acceptable if the user chooses the
prefer-read mode, rather than adding more complexity to handle the cached connection.
Comments?
I tested the new patch, and it works as expected.
I don't think that time-out of the cached session is a valid concern, because that
would have to be a really short timeout.
On the other hand, establishing the connection twice (first to check if it is read-only,
then again because no read-only connection is found) can be quite costly.
But that is a matter of debate, as is the readability of the code.
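To make the trade-off concrete, here is a minimal sketch that only simulates host selection. It is not code from the patch, and `SimHost`, `pick_cached` and `pick_reconnect` are hypothetical names. It assumes that reconnecting to the recorded read-write host succeeds, and it counts connection attempts for both approaches.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one host in the connection string (not a libpq type). */
typedef struct
{
    bool reachable;  /* would a connection attempt succeed? */
    bool read_only;  /* would SHOW transaction_read_only return "on"? */
} SimHost;

/*
 * Cached variant: remember the first reachable read-write host as if its
 * connection stayed open, and fall back to it without reconnecting.
 * Returns the chosen host index, or -1 if no host is reachable.
 */
int
pick_cached(const SimHost *hosts, size_t nhosts, int *attempts)
{
    int fallback = -1;

    *attempts = 0;
    for (size_t i = 0; i < nhosts; i++)
    {
        (*attempts)++;
        if (!hosts[i].reachable)
            continue;
        if (hosts[i].read_only)
            return (int) i;     /* preferred: a read-only server */
        if (fallback < 0)
            fallback = (int) i; /* first read-write server seen */
    }
    return fallback;
}

/*
 * Reconnect variant: same choice, but falling back costs one more
 * connection attempt because the read-write session was closed earlier.
 */
int
pick_reconnect(const SimHost *hosts, size_t nhosts, int *attempts)
{
    int chosen = pick_cached(hosts, nhosts, attempts);

    if (chosen >= 0 && !hosts[chosen].read_only)
        (*attempts)++;
    return chosen;
}
```

Both variants pick the same host. They differ by one connection attempt only when no read-only server is found, which is the case under discussion.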
Since I don't think I can contribute more to this patch, I'll mark it as
ready for committer.
Yours,
Laurenz Albe
On Tue, Jul 17, 2018 at 12:42 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
Haribabu Kommi wrote:
On Wed, Jul 4, 2018 at 11:14 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
Haribabu Kommi wrote:
- I think the construction with "read_write_host_index" makes the code even more
complicated than it already is.
What about keeping the first successful connection open and storing it in a
variable if we are in "prefer-read" mode.
If we get the read-only connection we desire, close that cached connection,
otherwise use it.
Even if we add a variable to cache the connection, I think the logic of checking
the next host for a read-only server will not change, but the extra connection
request to the read-write host will be removed.
I evaluated your suggestion of caching the connection and reusing it when no
read-only server is found, but I think it would add more complexity. Also, while
connections to the other servers are being attempted, the server may close the
cached connection because of a timeout.
I feel the extra time during connection may be acceptable if the user chooses the
prefer-read mode, rather than adding more complexity to handle the cached connection.
Comments?
I tested the new patch, and it works as expected.
Thanks for the confirmation.
I don't think that time-out of the cached session is a valid concern, because that
would have to be a really short timeout.
On the other hand, establishing the connection twice (first to check if it is read-only,
then again because no read-only connection is found) can be quite costly.
But that is a matter of debate, as is the readability of the code.
Thanks for your opinion; let's wait for opinions from others as well.
I can make the modification if others also find it useful.
Regards,
Haribabu Kommi
Fujitsu Australia
On Wed, Jul 4, 2018 at 9:14 AM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
What about keeping the first successful connection open and storing it in a
variable if we are in "prefer-read" mode.
If we get the read-only connection we desire, close that cached connection,
otherwise use it.
I like this idea. If I recall correctly, the logic in this area is
getting pretty complex, so we might need to refactor it for better
readability and maintainability.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Jul 18, 2018 at 10:53 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jul 4, 2018 at 9:14 AM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
What about keeping the first successful connection open and storing it in a
variable if we are in "prefer-read" mode.
If we get the read-only connection we desire, close that cached connection,
otherwise use it.
I like this idea. If I recall correctly, the logic in this area is
getting pretty complex, so we might need to refactor it for better
readability and maintainability.
OK. I will work on the code refactoring first and then provide the
prefer-read option on top of it.
Regards,
Haribabu Kommi
Fujitsu Australia
On Thu, Jul 19, 2018 at 10:59 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
On Wed, Jul 18, 2018 at 10:53 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jul 4, 2018 at 9:14 AM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
What about keeping the first successful connection open and storing it in a
variable if we are in "prefer-read" mode.
If we get the read-only connection we desire, close that cached connection,
otherwise use it.
I like this idea. If I recall correctly, the logic in this area is
getting pretty complex, so we might need to refactor it for better
readability and maintainability.
OK. I will work on the code refactoring first and then provide the
prefer-read option on top of it.
Commits d1c6a14bacf and 5ca00774194 have refactored the logic
of handling the different connection states.
Attached is a rebased patch after further refactoring the new option
code for easier maintenance.
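As a reading aid for the rebased patch, the accept/reject decision it makes after checking `SHOW transaction_read_only` can be sketched as a standalone function. This is a simplified illustration, not the patch's code: `session_is_acceptable` and its `second_pass` flag are hypothetical names, where `second_pass` stands for the state the patch encodes as `read_write_host_index == -2`.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Decide whether a freshly checked session should be kept.
 *   attrs       - value of target_session_attrs
 *   read_only   - true if SHOW transaction_read_only returned "on"
 *   second_pass - true once the host list was exhausted without finding
 *                 a read-only server (hypothetical flag, see lead-in)
 */
bool
session_is_acceptable(const char *attrs, bool read_only, bool second_pass)
{
    if (strcmp(attrs, "read-write") == 0)
        return !read_only;                /* must be writable */
    if (strcmp(attrs, "prefer-read") == 0)
        return read_only || second_pass;  /* read-write only as a fallback */
    return true;                          /* "any" accepts every session */
}
```

For example, `session_is_acceptable("prefer-read", false, false)` is false on the first pass over the host list, and the same read-write server becomes acceptable once `second_pass` is true.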
Regards,
Haribabu Kommi
Fujitsu Australia
Attachments:
0001-Allow-taget-session-attrs-to-accept-prefer-read-opti_v3.patch (application/octet-stream)
From d99b0456390573cb2df3324064e8d87c05cce327 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 7 Jun 2018 18:11:51 +1000
Subject: [PATCH] Allow target-session-attrs to accept prefer-read option
With the new prefer-read option, libpq prefers to connect to a read-only
server if one is available among the hosts in the connection string;
otherwise it connects to a read-write server.
---
doc/src/sgml/libpq.sgml | 13 ++-
src/interfaces/libpq/fe-connect.c | 146 ++++++++++++++++++++++----
src/interfaces/libpq/libpq-fe.h | 2 +
src/interfaces/libpq/libpq-int.h | 3 +-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
5 files changed, 152 insertions(+), 26 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 06d909e804..b70cf04a61 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1584,8 +1584,17 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
successful connection; if it returns <literal>on</literal>, the connection
will be closed. If multiple hosts were specified in the connection
string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
+ attempt had failed.
+ </para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, connections
+ where <literal>SHOW transaction_read_only</literal> returns <literal>on</literal>
+ are preferred. If no such connections can be found, then a connection
+ that allows read-write transactions will be accepted.
+ </para>
+ <para>
+ The default value of this parameter, <literal>any</literal>, regards all
+ connections as acceptable.
</para>
</listitem>
</varlistentry>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index d001bc513d..93d9abd3fb 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -322,7 +322,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1240,7 +1240,8 @@ connectOptions2(PGconn *conn)
if (conn->target_session_attrs)
{
if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ && strcmp(conn->target_session_attrs, "read-write") != 0
+ && strcmp(conn->target_session_attrs, "prefer-read") != 0)
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -2083,6 +2084,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_SSL_STARTUP:
case CONNECTION_NEEDED:
case CONNECTION_CHECK_WRITABLE:
+ case CONNECTION_CHECK_READONLY:
case CONNECTION_CONSUME:
break;
@@ -2123,13 +2125,28 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to
+ * a read-only server, so we now retry the
+ * recorded read-write server. This happens
+ * only in 'prefer-read' mode.
+ */
+ conn->whichhost = conn->read_write_host_index;
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is already
+ * set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -2317,6 +2334,7 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
+
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("could not create socket: %s\n"),
SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
@@ -3158,7 +3176,8 @@ keep_going: /* We will come back to here until there is
}
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required,
+ * see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3166,7 +3185,8 @@ keep_going: /* We will come back to here until there is
*/
if (conn->sversion >= 70400 &&
conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ (strcmp(conn->target_session_attrs, "read-write") == 0 ||
+ strcmp(conn->target_session_attrs, "prefer-read") == 0))
{
/*
* Save existing error messages across the PQsendQuery
@@ -3185,11 +3205,40 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
+
+ if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ else
+ conn->status = CONNECTION_CHECK_READONLY;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
+ /*
+ * If the requested type is prefer-read, record this host's index
+ * and try the other hosts first; this host is reconsidered later.
+ */
+ if ((conn->target_session_attrs != NULL) &&
+ (strcmp(conn->target_session_attrs, "prefer-read") == 0) &&
+ (conn->read_write_host_index != -2))
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record it */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3226,9 +3275,9 @@ keep_going: /* We will come back to here until there is
}
/*
- * If a read-write connection is required, see if we have one.
- * (This should match the stanza in the CONNECTION_AUTH_OK case
- * above.)
+ * If a read-write or prefer-read connection is required,
+ * see if we have one. (This should match the stanza in the
+ * CONNECTION_AUTH_OK case above.)
*
* Servers before 7.4 lack the transaction_read_only GUC, but by
* the same token they don't have any read-only mode, so we may
@@ -3236,7 +3285,8 @@ keep_going: /* We will come back to here until there is
*/
if (conn->sversion >= 70400 &&
conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ ((strcmp(conn->target_session_attrs, "read-write") == 0) ||
+ (strcmp(conn->target_session_attrs, "prefer-read") == 0)))
{
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
@@ -3248,11 +3298,40 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
+
+ if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ else
+ conn->status = CONNECTION_CHECK_READONLY;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
+ /*
+ * If the requested type is prefer-read, record this host's index
+ * and try the other hosts first; this host is reconsidered later.
+ */
+ if ((conn->target_session_attrs != NULL) &&
+ (strcmp(conn->target_session_attrs, "prefer-read") == 0) &&
+ (conn->read_write_host_index != -2))
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record it */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3291,13 +3370,16 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_OK;
}
case CONNECTION_CHECK_WRITABLE:
+ case CONNECTION_CHECK_READONLY:
{
+ ConnStatusType oldstatus;
const char *displayed_host;
const char *displayed_port;
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
+ oldstatus = conn->status;
conn->status = CONNECTION_OK;
if (!PQconsumeInput(conn))
{
@@ -3307,7 +3389,7 @@ keep_going: /* We will come back to here until there is
if (PQisBusy(conn))
{
- conn->status = CONNECTION_CHECK_WRITABLE;
+ conn->status = oldstatus;
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
@@ -3319,11 +3401,24 @@ keep_going: /* We will come back to here until there is
char *val;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+
+ /*
+ * Server is read-only and requested mode is read-write, ignore it.
+ * Server is read-write and requested mode is prefer-read, record
+ * it for the first time and try to consume in the next scan (it means
+ * no read-only server is found in the first scan).
+ */
+ if (((strncmp(val, "on", 2) == 0) &&
+ (oldstatus == CONNECTION_CHECK_WRITABLE)) ||
+ ((strncmp(val, "off", 3) == 0) &&
+ (oldstatus == CONNECTION_CHECK_READONLY) &&
+ (conn->read_write_host_index != -2)))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
const char *displayed_host;
const char *displayed_port;
+ const char *type = (oldstatus == CONNECTION_CHECK_READONLY) ?
+ "read-only" : "writable";
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3338,15 +3433,20 @@ keep_going: /* We will come back to here until there is
displayed_port = DEF_PGPORT_STR;
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
+ libpq_gettext("could not make a %s "
"connection to server "
"\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ type, displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if ((oldstatus == CONNECTION_CHECK_READONLY) &&
+ (conn->read_write_host_index == -1))
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3355,7 +3455,7 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3568,6 +3668,8 @@ makeEmptyPGconn(void)
conn = NULL;
}
+ conn->read_write_host_index = -1;
+
return conn;
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 52bd5d2cd8..582f83bb0e 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -65,6 +65,8 @@ typedef enum
CONNECTION_NEEDED, /* Internal state: connect() needed */
CONNECTION_CHECK_WRITABLE, /* Check if we could make a writable
* connection. */
+ CONNECTION_CHECK_READONLY, /* Check if we could make a read-only
+ * connection. */
CONNECTION_CONSUME /* Wait for any pending message and consume
* them. */
} ConnStatusType;
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 975ab33d02..a4716af654 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -363,7 +363,7 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /* Type of connection to make. Possible values: any, read-write, prefer-read. */
char *target_session_attrs;
/* Optional file to write trace info to */
@@ -397,6 +397,7 @@ struct pg_conn
int nconnhost; /* # of hosts named in conn string */
int whichhost; /* host we're currently trying/connected to */
pg_conn_host *connhost; /* details about each named host */
+ int read_write_host_index; /* index for first read-write host in connhost */
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 8dff5fc720..8a6edd6867 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 26;
+use Test::More tests => 29;
# Initialize master node
my $node_master = get_new_node('master');
@@ -117,6 +117,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.18.0.windows.1
Haribabu Kommi wrote:
On Thu, Jul 19, 2018 at 10:59 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
On Wed, Jul 18, 2018 at 10:53 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Jul 4, 2018 at 9:14 AM, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
What about keeping the first successful connection open and storing it in a
variable if we are in "prefer-read" mode.
If we get the read-only connection we desire, close that cached connection,
otherwise use it.
I like this idea. If I recall correctly, the logic in this area is
getting pretty complex, so we might need to refactor it for better
readability and maintainability.
OK. I will work on the code refactoring first and then provide the
prefer-read option on top of it.
Commits d1c6a14bacf and 5ca00774194 have refactored the logic
of handling the different connection states.
Attached is a rebased patch after further refactoring the new option
code for easier maintenance.
The code is much more readable now, thanks.
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -397,6 +397,7 @@ struct pg_conn
int nconnhost; /* # of hosts named in conn string */
int whichhost; /* host we're currently trying/connected to */
pg_conn_host *connhost; /* details about each named host */
+ int read_write_host_index; /* index for first read-write host in connhost */
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
I think the comment could use more love.
This would be the place to document the logic:
Initial value is -1, then the index of the first working server
we found, and -2 for the second attempt to connect to that server.
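One way to make these three states explicit is to give the sentinel values names and funnel the transitions through small helpers. This is only a sketch of the idea; the macro and function names below are hypothetical and do not appear in the patch.

```c
/* Hypothetical names for the sentinel values of read_write_host_index. */
#define RW_HOST_NONE   (-1)  /* no read-write host has been seen yet */
#define RW_HOST_RETRY  (-2)  /* second pass: reconnecting to the recorded host */

/*
 * Record the first read-write host. Later read-write hosts, and any call
 * made during the second pass, leave the state unchanged.
 */
int
rw_host_record(int state, int whichhost)
{
    return (state == RW_HOST_NONE) ? whichhost : state;
}

/*
 * Called when the host list is exhausted. If a read-write host was recorded,
 * return its index and switch to the retry state; otherwise return -1,
 * meaning there is nothing left to try.
 */
int
rw_host_on_list_exhausted(int *state)
{
    if (*state >= 0)
    {
        int host = *state;

        *state = RW_HOST_RETRY;
        return host;
    }
    return -1;
}
```

With helpers like these, the field comment in `struct pg_conn` could simply list the two sentinels and say that any non-negative value is an index into `connhost`.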
I notice that you don't keep the first connection open, but close
and reopen it. I guess that is a matter of taste, but it would be
easier on resources (and reduce connection time) if the connection
were kept open.
Admittedly, it would be more difficult and might further complicate
code that is not very clear as it is.
If you work some more on the comment above, I will mark it as
ready for committer.
Yours,
Laurenz Albe
On Fri, Sep 28, 2018 at 5:31 PM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
On Thu, Jul 19, 2018 at 10:59 PM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
On Wed, Jul 18, 2018 at 10:53 PM Robert Haas <robertmhaas@gmail.com>
wrote:
On Wed, Jul 4, 2018 at 9:14 AM, Laurenz Albe <laurenz.albe@cybertec.at>
wrote:
What about keeping the first successful connection open and storing it in a
variable if we are in "prefer-read" mode.
If we get the read-only connection we desire, close that cached connection,
otherwise use it.
I like this idea. If I recall correctly, the logic in this area is
getting pretty complex, so we might need to refactor it for better
readability and maintainability.
OK. I will work on the code refactoring first and then provide the
prefer-read option on top of it.
Commits d1c6a14bacf and 5ca00774194 have refactored the logic
of handling the different connection states.
Attached is a rebased patch after further refactoring the new option
code for easier maintenance.
[Somehow I didn't receive this mail; copied and pasted from the mailing list.]
The code is much more readable now, thanks.
Thanks for the review.
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -397,6 +397,7 @@ struct pg_conn
int nconnhost; /* # of hosts named in conn string */
int whichhost; /* host we're currently trying/connected to */
pg_conn_host *connhost; /* details about each named host */
+ int read_write_host_index; /* index for first read-write host in connhost */
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
I think the comment could use more love.
This would be the place to document the logic:
Initial value is -1, then the index of the first working server
we found, and -2 for the second attempt to connect to that server.
Added comments along the lines that you mentioned, and also tried
to update some more comments.
I notice that you don't keep the first connection open, but close
and reopen it. I guess that is a matter of taste, but it would be
easier on resources (and reduce connection time) if the connection
were kept open.
Admittedly, it would be more difficult and might further complicate
code that is not very clear as it is.
Yes, I didn't add the logic to keep the first connection open. Currently
I feel that supporting it adds more complexity. If everyone feels
that it is required, I will add that logic.
Updated patch attached.
Regards,
Haribabu Kommi
Fujitsu Australia
Attachments:
0001-Allow-taget-session-attrs-to-accept-prefer-read-opti_v4.patch (application/octet-stream)
From d513e1748677048fe7f4e18419d092247a211c3e Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 7 Jun 2018 18:11:51 +1000
Subject: [PATCH] Allow taget-session-attrs to accept prefer-read option
With the new prefer-read option, libpq prefers to connect to a
read-only server if one is available in the connection string;
otherwise it connects to the first read-write server in the
connection string.
---
doc/src/sgml/libpq.sgml | 13 ++-
src/interfaces/libpq/fe-connect.c | 150 ++++++++++++++++++++++----
src/interfaces/libpq/libpq-fe.h | 2 +
src/interfaces/libpq/libpq-int.h | 10 +-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
5 files changed, 163 insertions(+), 26 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 601091c570..d28dedd942 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1584,8 +1584,17 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
successful connection; if it returns <literal>on</literal>, the connection
will be closed. If multiple hosts were specified in the connection
string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
+ attempt had failed.
+ </para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, connections
+ where <literal>SHOW transaction_read_only</literal> returns <literal>on</literal>
+ are preferred. If no such connections can be found, then a connection
+ that allows read-write transactions will be accepted.
+ </para>
+ <para>
+ The default value of this parameter,<literal>any</literal>, regards all
+ connections as acceptable.
</para>
</listitem>
</varlistentry>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index d001bc513d..ed20226c13 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -322,7 +322,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1240,7 +1240,8 @@ connectOptions2(PGconn *conn)
if (conn->target_session_attrs)
{
if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ && strcmp(conn->target_session_attrs, "read-write") != 0
+ && strcmp(conn->target_session_attrs, "prefer-read") != 0)
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -2083,6 +2084,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_SSL_STARTUP:
case CONNECTION_NEEDED:
case CONNECTION_CHECK_WRITABLE:
+ case CONNECTION_CHECK_READONLY:
case CONNECTION_CONSUME:
break;
@@ -2123,13 +2125,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means, failed to connect to read-only servers
+ * and now try connect to read-write server again.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is already
+ * set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -2317,6 +2337,7 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
+
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("could not create socket: %s\n"),
SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
@@ -3158,7 +3179,8 @@ keep_going: /* We will come back to here until there is
}
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required,
+ * see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3166,7 +3188,8 @@ keep_going: /* We will come back to here until there is
*/
if (conn->sversion >= 70400 &&
conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ (strcmp(conn->target_session_attrs, "read-write") == 0 ||
+ strcmp(conn->target_session_attrs, "prefer-read") == 0))
{
/*
* Save existing error messages across the PQsendQuery
@@ -3185,11 +3208,40 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
+
+ if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ else
+ conn->status = CONNECTION_CHECK_READONLY;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if ((conn->target_session_attrs != NULL) &&
+ (strcmp(conn->target_session_attrs, "prefer-read") == 0) &&
+ (conn->read_write_host_index != -2))
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3226,9 +3278,9 @@ keep_going: /* We will come back to here until there is
}
/*
- * If a read-write connection is required, see if we have one.
- * (This should match the stanza in the CONNECTION_AUTH_OK case
- * above.)
+ * If a read-write or prefer-read connection is required,
+ * see if we have one. (This should match the stanza in the
+ * CONNECTION_AUTH_OK case above.)
*
* Servers before 7.4 lack the transaction_read_only GUC, but by
* the same token they don't have any read-only mode, so we may
@@ -3236,7 +3288,8 @@ keep_going: /* We will come back to here until there is
*/
if (conn->sversion >= 70400 &&
conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ ((strcmp(conn->target_session_attrs, "read-write") == 0) ||
+ (strcmp(conn->target_session_attrs, "prefer-read") == 0)))
{
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
@@ -3248,11 +3301,40 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
+
+ if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ else
+ conn->status = CONNECTION_CHECK_READONLY;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if ((conn->target_session_attrs != NULL) &&
+ (strcmp(conn->target_session_attrs, "prefer-read") == 0) &&
+ (conn->read_write_host_index != -2))
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3291,13 +3373,16 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_OK;
}
case CONNECTION_CHECK_WRITABLE:
+ case CONNECTION_CHECK_READONLY:
{
+ ConnStatusType oldstatus;
const char *displayed_host;
const char *displayed_port;
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
+ oldstatus = conn->status;
conn->status = CONNECTION_OK;
if (!PQconsumeInput(conn))
{
@@ -3307,7 +3392,7 @@ keep_going: /* We will come back to here until there is
if (PQisBusy(conn))
{
- conn->status = CONNECTION_CHECK_WRITABLE;
+ conn->status = oldstatus;
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
@@ -3319,11 +3404,24 @@ keep_going: /* We will come back to here until there is
char *val;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+
+ /*
+ * Server is read-only and requested mode is read-write, ignore it.
+ * Server is read-write and requested mode is prefer-read, record
+ * it for the first time and try to consume in the next scan (it means
+ * no read-only server is found in the first scan).
+ */
+ if (((strncmp(val, "on", 2) == 0) &&
+ (oldstatus == CONNECTION_CHECK_WRITABLE)) ||
+ ((strncmp(val, "off", 3) == 0) &&
+ (oldstatus == CONNECTION_CHECK_READONLY) &&
+ (conn->read_write_host_index != -2)))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
const char *displayed_host;
const char *displayed_port;
+ const char *type = (oldstatus == CONNECTION_CHECK_READONLY) ?
+ "read-only" : "writable";
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3338,15 +3436,20 @@ keep_going: /* We will come back to here until there is
displayed_port = DEF_PGPORT_STR;
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
+ libpq_gettext("could not make a %s "
"connection to server "
"\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ type, displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if ((oldstatus == CONNECTION_CHECK_READONLY) &&
+ (conn->read_write_host_index == -1))
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3355,7 +3458,7 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3568,6 +3671,9 @@ makeEmptyPGconn(void)
conn = NULL;
}
+ /* Initial value */
+ conn->read_write_host_index = -1;
+
return conn;
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 52bd5d2cd8..582f83bb0e 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -65,6 +65,8 @@ typedef enum
CONNECTION_NEEDED, /* Internal state: connect() needed */
CONNECTION_CHECK_WRITABLE, /* Check if we could make a writable
* connection. */
+ CONNECTION_CHECK_READONLY, /* Check if we could make a read-only
+ * connection. */
CONNECTION_CONSUME /* Wait for any pending message and consume
* them. */
} ConnStatusType;
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 975ab33d02..e62378cdea 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -363,7 +363,7 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /* Type of connection to make. Possible values: any, read-write, prefer-read. */
char *target_session_attrs;
/* Optional file to write trace info to */
@@ -398,6 +398,14 @@ struct pg_conn
int whichhost; /* host we're currently trying/connected to */
pg_conn_host *connhost; /* details about each named host */
+ /*
+ * First read-write host index in the connection string.
+ *
+ * Initial value is -1, then the index of the first read-write
+ * host, -2 during the second attempt of connection to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 8dff5fc720..8a6edd6867 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 26;
+use Test::More tests => 29;
# Initialize master node
my $node_master = get_new_node('master');
@@ -117,6 +117,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.18.0.windows.1
Haribabu Kommi wrote:
Added comments along the lines that you mentioned. And also try
to update some more comments.
Looks ok to me, I'll mark it as "ready for committer".
Yours,
Laurenz Albe
Laurenz Albe <laurenz.albe@cybertec.at> writes:
Haribabu Kommi wrote:
Added comments along the lines that you mentioned. And also try
to update some more comments.
Looks ok to me, I'll mark it as "ready for committer".
I don't like this patch at all: the business with keeping two connections
open seems impossibly fragile and full of race conditions. (For instance,
by the time you return the read-only session to the application, it might
not be read-only any more. I also wonder what inquiry functions like
PQsocket ought to return while in this state.) I think the feature
definition needs to be re-thought to make that unnecessary.
Also, we really need to consider the interaction between this and the
feature(s) being discussed in the thread at
/messages/by-id/1700970.cRWpxnom9y@hammer.magicstack.net
regards, tom lane
Tom Lane wrote:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
Haribabu Kommi wrote:
Added comments along the lines that you mentioned. And also try
to update some more comments.
Looks ok to me, I'll mark it as "ready for committer".
I don't like this patch at all: the business with keeping two connections
open seems impossibly fragile and full of race conditions. (For instance,
by the time you return the read-only session to the application, it might
not be read-only any more. I also wonder what inquiry functions like
PQsocket ought to return while in this state.) I think the feature
definition needs to be re-thought to make that unnecessary.
As it is now, the patch doesn't keep two connections open. It remembers
the index of the host of the first successful writable connection, but
closes the connection, and opens another one to that host if no read-only
host can be found.
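To make the order of events described above concrete, here is a minimal sketch in C. It is a simulation under stated assumptions, not the libpq code: `host_state` and `choose_host` are hypothetical names, and real connection attempts are replaced by a table of per-host outcomes.

```c
#include <stddef.h>

/* Hypothetical stand-in for the outcome of probing one host. */
typedef enum
{
	HOST_DOWN,			/* connection attempt failed */
	HOST_READ_ONLY,		/* SHOW transaction_read_only returned "on" */
	HOST_READ_WRITE		/* SHOW transaction_read_only returned "off" */
} host_state;

/*
 * Simulate prefer-read: return the index of the host that would be used,
 * or -1 if no host is usable.  A read-write host seen during the scan is
 * only remembered (the connection to it is closed); it is used, through a
 * second connection attempt, only if the scan finds no read-only host.
 */
int
choose_host(const host_state *hosts, size_t nhosts)
{
	int			first_read_write = -1;
	size_t		i;

	for (i = 0; i < nhosts; i++)
	{
		if (hosts[i] == HOST_READ_ONLY)
			return (int) i;		/* preferred kind found, stop here */
		if (hosts[i] == HOST_READ_WRITE && first_read_write == -1)
			first_read_write = (int) i;	/* remember, keep scanning */
	}
	return first_read_write;	/* fallback host, or -1 if none worked */
}
```

The sketch assumes the second attempt to the remembered host succeeds; in reality that host may have gone away or changed role in the meantime.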
If the read-only connection turns writable after it has been tested,
but before it is returned, that can hardly be avoided.
I don't think that's so bad - after all, you asked for a read-only
connection *if possible*.
If you demand that the server be not promoted until the connection has
been returned to the client, you'd somehow have to block the server
from being promoted, right?
Also, we really need to consider the interaction between this and the
feature(s) being discussed in the thread at
That's a good point.
Yours,
Laurenz Albe
On Tue, Nov 13, 2018 at 7:26 AM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:
Tom Lane wrote:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
Haribabu Kommi wrote:
Added comments along the lines that you mentioned. And also try
to update some more comments.
Looks ok to me, I'll mark it as "ready for committer".
I don't like this patch at all: the business with keeping two connections
open seems impossibly fragile and full of race conditions. (For instance,
by the time you return the read-only session to the application, it might
not be read-only any more. I also wonder what inquiry functions like
PQsocket ought to return while in this state.) I think the feature
definition needs to be re-thought to make that unnecessary.
As it is now, the patch doesn't keep two connections open. It remembers
the index of the host of the first successful writable connection, but
closes the connection, and opens another one to that host if no read-only
host can be found.
If the read-only connection turns writable after it has been tested,
but before it is returned, that can hardly be avoided.
I don't think that's so bad - after all, you asked for a read-only
connection *if possible*.
If you demand that the server be not promoted until the connection has
been returned to the client, you'd somehow have to block the server
from being promoted, right?
I share Laurenz's opinion: this option lets the application connect to
a read-only server if possible, and otherwise lets it connect to a
read-write server.
I feel that state changes on the server, during or after connection
establishment, need not be reflected on the existing connection for these
connection types.
Also, we really need to consider the interaction between this and the
feature(s) being discussed in the thread at
/messages/by-id/1700970.cRWpxnom9y@hammer.magicstack.net
That's a good point.
Thanks for the link. Depending on the conclusion of the other thread about
GUC_REPORT, this patch can also use that logic, but for these connection
types it would apply only until the connection is established.
Regards,
Haribabu Kommi
Fujitsu Australia
On Tue, Nov 13, 2018 at 05:54:17PM +1100, Haribabu Kommi wrote:
On Tue, Nov 13, 2018 at 7:26 AM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:
As it is now, the patch doesn't keep two connections open. It remembers
the index of the host of the first successful writable connection, but
closes the connection, and opens another one to that host if no read-only
host can be found.
That's commented in the patch as follows:
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if ((conn->target_session_attrs != NULL) &&
+ (strcmp(conn->target_session_attrs, "prefer-read") == 0) &&
+ (conn->read_write_host_index != -2))
If the read-only connection turns writable after it has been tested,
but before it is returned, that can hardly be avoided.
I don't think that's so bad - after all, you asked for a read-only
connection *if possible*.
Yeah, it is difficult to guarantee that, other than by checking from time
to time that the connection is still read-only after establishing it.
It is in my opinion mostly a matter of documentation, meaning that the
selection is done when the connection is attempted from the defined
set.
I also have the same opinion of Laurenz, that this option is letting the
application to connect to read-only server if possible, otherwise let it
connect to read-write server.
I feel that any of the state changes during the connection and after
connection, needs not to be reflected on the existing connection for
these type of connections.
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
Okay, this gets ugly. I am pretty sure that we should use instead a
status flag and actually avoid any kind of recursion risk in the logic.
Or else it would get hard to track to which value what needs to be set
where in the code.
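As a rough illustration of the status-flag idea, the sketch below separates "which host did we remember" from "are we already on the second pass", instead of encoding both in one integer. All names (`prefer_read_state` and the three functions) are hypothetical and do not come from the patch or from libpq.

```c
#include <stdbool.h>

/* Hypothetical replacement for the -1 / index / -2 encoding. */
typedef struct
{
	int			fallback_host;	/* first read-write host seen, or -1 */
	bool		using_fallback; /* true once the second pass has begun */
} prefer_read_state;

void
prefer_read_init(prefer_read_state *st)
{
	st->fallback_host = -1;
	st->using_fallback = false;
}

/*
 * Called when the current host turns out to be read-write.  Returns true
 * if the connection should be accepted now, false if it should be closed
 * and the scan continued.
 */
bool
prefer_read_on_read_write(prefer_read_state *st, int whichhost)
{
	if (st->using_fallback)
		return true;			/* second pass: take what we get */
	if (st->fallback_host < 0)
		st->fallback_host = whichhost;	/* remember only the first one */
	return false;
}

/*
 * Called when the host list is exhausted.  Returns the host index to
 * retry, or -1 if there is nothing left to try.
 */
int
prefer_read_on_exhausted(prefer_read_state *st)
{
	if (st->using_fallback || st->fallback_host < 0)
		return -1;
	st->using_fallback = true;
	return st->fallback_host;
}
```

Because the flag is set exactly once, the second pass cannot trigger a third one, which is the recursion risk the -2 value guards against in the patch.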
From purely the code point of view, it seems to me that it is actually
more simple to implement a "read-only" mode as this way there is no need
to mix between CONNECTION_CHECK_READONLY and CONNECTION_CHECK_WRITABLE,
remembering the past index of a connection which may be needed later on
if the next ones don't meet with the wanted conditions.
Each time I have heard about load balancing, applications did not really
care about whether only standbys were used for a set of queries and
accepted that the primary also shared some of the read-only load, be it
for analytics or OLTP, in which case "any" covers already everything
needed. And if you really want a standby, "read-only" would also be
useful so that an application layer can properly fail if there is only a
primary available.
JDBC has its own set of options with targetServerType: master, slave,
secondary, preferSlave and preferSecondary. What's proposed here is
preferSlave and what we would miss is slave at the end.
--
Michael
On Thu, Nov 15, 2018 at 05:14:33PM +0900, Michael Paquier wrote:
JDBC has its own set of options with targetServerType: master, slave,
secondary, preferSlave and preferSecondary. What's proposed here is
preferSlave and what we would miss is slave at the end.
So thinking a couple of extra minutes on this one, I am wondering if it
would be better to close completely the gap with two patches:
1) Get "read-only" done first, which uses most of the existing
infrastructure. That seems simple enough.
2) Get "prefer-read" and "prefer-write", which need some new
infrastructure to track the last preferred connection depending on what
the client wants.
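A small sketch of how the two steps could differ in code, assuming the option spellings used in this thread. Only "any" and "read-write" exist in libpq at this point; "prefer-read" comes from the patch, and "read-only" and "prefer-write" are just the names proposed above. The `tsa_*` identifiers are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

typedef enum
{
	TSA_ANY,
	TSA_READ_WRITE,
	TSA_READ_ONLY,				/* step 1: strict, no fallback needed */
	TSA_PREFER_READ,			/* step 2: needs fallback tracking */
	TSA_PREFER_WRITE,			/* step 2: needs fallback tracking */
	TSA_INVALID
} tsa_mode;

/* Map an option string to a mode; unknown strings are rejected. */
tsa_mode
tsa_parse(const char *s)
{
	if (strcmp(s, "any") == 0)
		return TSA_ANY;
	if (strcmp(s, "read-write") == 0)
		return TSA_READ_WRITE;
	if (strcmp(s, "read-only") == 0)
		return TSA_READ_ONLY;
	if (strcmp(s, "prefer-read") == 0)
		return TSA_PREFER_READ;
	if (strcmp(s, "prefer-write") == 0)
		return TSA_PREFER_WRITE;
	return TSA_INVALID;
}

/* Is a server with the given read-only status acceptable on the first pass? */
bool
tsa_first_pass_ok(tsa_mode mode, bool server_read_only)
{
	switch (mode)
	{
		case TSA_ANY:
			return true;
		case TSA_READ_WRITE:
		case TSA_PREFER_WRITE:
			return !server_read_only;
		case TSA_READ_ONLY:
		case TSA_PREFER_READ:
			return server_read_only;
		default:
			return false;
	}
}

/* Only the "prefer" modes may fall back to a host of the other kind. */
bool
tsa_allows_fallback(tsa_mode mode)
{
	return mode == TSA_PREFER_READ || mode == TSA_PREFER_WRITE;
}
```

The strict modes need only the first-pass check, which is why step 1 can reuse most of the existing infrastructure; the extra bookkeeping is confined to the modes for which `tsa_allows_fallback` returns true.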
--
Michael
Laurenz Albe <laurenz.albe@cybertec.at> writes:
Tom Lane wrote:
I don't like this patch at all: the business with keeping two connections
open seems impossibly fragile and full of race conditions.
As it is now, the patch doesn't keep two connections open. It remembers
the index of the host of the first successful writable connection, but
closes the connection, and opens another one to that host if no read-only
host can be found.
Oh! The reason I assumed it wasn't doing that is that such a behavior
seems completely insane. If the point is to keep down the load on your
master server, then connecting only to immediately disconnect is not
a friendly way to do that --- even without counting the fact that you
might later come back and connect again.
If that's the best we can do, we should forget the whole feature and
just recommend putting slave servers first in your hosts list when
you want prefer-slave.
regards, tom lane
Tom Lane wrote:
As it is now, the patch doesn't keep two connections open. It remembers
the index of the host of the first successful writable connection, but
closes the connection, and opens another one to that host if no read-only
host can be found.
Oh! The reason I assumed it wasn't doing that is that such a behavior
seems completely insane. If the point is to keep down the load on your
master server, then connecting only to immediately disconnect is not
a friendly way to do that --- even without counting the fact that you
might later come back and connect again.
That's why I had argued initially to keep the session open, but you
seem to dislike that idea as well.
If that's the best we can do, we should forget the whole feature and
just recommend putting slave servers first in your hosts list when
you want prefer-slave.
If you know which is which, certainly.
But in a setup with automated failover you cannot be certain which is which.
That's what the proposed feature targets.
Yours,
Laurenz Albe
On Sat, Nov 17, 2018 at 4:56 AM Laurenz Albe <laurenz.albe@cybertec.at>
wrote:
Tom Lane wrote:
As it is now, the patch doesn't keep two connections open. It remembers
the index of the host of the first successful writable connection, but
closes the connection, and opens another one to that host if no read-only
host can be found.
Oh! The reason I assumed it wasn't doing that is that such a behavior
seems completely insane. If the point is to keep down the load on your
master server, then connecting only to immediately disconnect is not
a friendly way to do that --- even without counting the fact that you
might later come back and connect again.
That's why I had argued initially to keep the session open, but you
seem to dislike that idea as well.
Yes, we need either the keep-the-session-open approach or the reconnect
approach to find out whether the server is read-write or read-only.
Also, for the read-only or prefer-read connection types: if the server is
promoted to read-write after the connection has been established, I feel
we can keep the connection. That decision keeps the feature simple. Or do
we want to stop the connection?
Regards,
Haribabu Kommi
Fujitsu Australia
On Sat, Nov 17, 2018 at 10:41:54AM +1100, Haribabu Kommi wrote:
Yes, we need either session open or reconnect it approach to find out
the whether server is read-write or read-only.
Even if there is no agreement on this part, wouldn't a read-only option
be enough to support any case? With a cluster of two nodes, one primary
and one standby, a connection string listing both nodes would fail only
in the middle of a planned failover. If you rinse and repeat it is
possible to have a larger control at application level because you
precisely know the whole state of the cluster at a given instant.
And also for read-only or prefer-read connection types, once after the
connection establishment is done, later server promotes to read-write,
I feel we can continue the connection, that decision makes the feature
simple, or do we want to stop the connection?
That feels like something the application needs to care about once the
session is established, but that's only my take on the matter.
--
Michael
Tom>Yes, we need either session open or reconnect it approach to find out
Tom>the whether server is read-write or read-only.
Just in case, pgjdbc has that feature for quite a while, and the behavior
there is to keep the connection until it fails or application decides to
close it.
pgjdbc uses three parameters (since 2014):
1) targetServerType=(any | master | secondary | preferSecondary). Default
is "any". When set to "master" it will look for "read-write" server. If set
to "preferSecondary" it would search for "read-only" server first, then
fall back to master, and so on.
2) loadBalanceHosts=(true | false). pgjdbc can load-balance across the
servers provided in the connection URL. When set to "false", pgjdbc tries
connections in order, otherwise it shuffles the connections.
3) hostRecheckSeconds=int. pgjdbc caches "read/write" status of a host:port
combination, so it doesn't re-check the status if multiple connections are
created within hostRecheckSeconds timeframe.
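The caching idea behind hostRecheckSeconds can be sketched in a few lines of C. This is not pgjdbc's implementation (pgjdbc is written in Java); `host_status_cache`, `cache_lookup` and `cache_store` are hypothetical names, and treating an entry as stale once its age reaches the limit is an assumption.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical cache entry for one host:port combination. */
typedef struct
{
	bool		known;			/* has the status ever been checked? */
	bool		read_only;		/* result of the last check */
	time_t		checked_at;		/* when the last check happened */
} host_status_cache;

/*
 * Return true and set *read_only if the cached status is still fresh;
 * return false if the host must be probed again.
 */
bool
cache_lookup(const host_status_cache *c, time_t now,
			 int recheck_seconds, bool *read_only)
{
	if (!c->known || difftime(now, c->checked_at) >= recheck_seconds)
		return false;
	*read_only = c->read_only;
	return true;
}

/* Record the result of a fresh probe. */
void
cache_store(host_status_cache *c, time_t now, bool read_only)
{
	c->known = true;
	c->read_only = read_only;
	c->checked_at = now;
}
```

With such a cache, a client opening many connections in a short time would probe each host at most once per recheck interval, at the cost of acting on a status that may be slightly out of date.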
It is sad that essentially the same feature is re-implemented in core with
different name/semantics.
Does it make sense to align parameter names/semantics?
Vladimir
On Tue, 20 Nov 2018 at 06:23, Vladimir Sitnikov <sitnikov.vladimir@gmail.com>
wrote:
Tom>Yes, we need either session open or reconnect it approach to find out
Tom>the whether server is read-write or read-only.
Just in case, pgjdbc has that feature for quite a while, and the behavior
there is to keep the connection until it fails or application decides to
close it.
pgjdbc uses three parameters (since 2014):
1) targetServerType=(any | master | secondary | preferSecondary). Default
is "any". When set to "master" it will look for "read-write" server. If set
to "preferSecondary" it would search for "read-only" server first, then
fall back to master, and so on.
2) loadBalanceHosts=(true | false). pgjdbc enables to load-balance across
servers provided in the connection URL. When set to "false", pgjdbc tries
connections in order, otherwise it shuffles the connections.
3) hostRecheckSeconds=int. pgjdbc caches "read/write" status of a
host:port combination, so it don't re-check the status if multiple
connections are created within hostRecheckSeconds timeframe.
It is sad that essentially the same feature is re-implemented in core with
different name/semantics.
Does it make sense to align parameter names/semantics?
Looking at
/messages/by-id/1700970.cRWpxnom9y@hammer.magicstack.net
Which Tom points out as being relevant to this discussion ISTM that this is
becoming a half baked "feature" that is being cobbled together instead of
being designed. Admittedly biased but I agree with Vladimir that libpq did
not implement the above feature using the same name and semantics. This
just serves to confuse the users.
Just my 2c worth
Dave Cramer
On Fri, Nov 16, 2018 at 11:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Oh! The reason I assumed it wasn't doing that is that such a behavior
seems completely insane. If the point is to keep down the load on your
master server, then connecting only to immediately disconnect is not
a friendly way to do that --- even without counting the fact that you
might later come back and connect again.
That seems like a really weak argument. Opening a connection to the
master surely isn't free, but it must be vastly cheaper than the cost
of the queries you intend to run. I mean, no reasonable production
user of PostgreSQL opens a connection, runs one or two short queries,
and then closes the connection. You open a connection and keep it
open for minutes, hours, days, or longer, running hundreds, thousands,
or millions of queries. The cost of checking whether you've got a
master or a standby is a drop in the bucket.
And, I mean, if there's some scenario where what I just said isn't
true, well then don't use this feature in that particular case.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, 21 Nov 2018 at 09:05, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Nov 16, 2018 at 11:35 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Oh! The reason I assumed it wasn't doing that is that such a behavior
seems completely insane. If the point is to keep down the load on your
master server, then connecting only to immediately disconnect is not
a friendly way to do that --- even without counting the fact that you
might later come back and connect again.
That seems like a really weak argument. Opening a connection to the
master surely isn't free, but it must be vastly cheaper than the cost
of the queries you intend to run. I mean, no reasonable production
user of PostgreSQL opens a connection, runs one or two short queries,
and then closes the connection. You open a connection and keep it
open for minutes, hours, days, or longer, running hundreds, thousands,
or millions of queries. The cost of checking whether you've got a
master or a standby is a drop in the bucket.
And, I mean, if there's some scenario where what I just said isn't
true, well then don't use this feature in that particular case.
And to reinforce Robert's argument even further, almost every pool
implementation I am aware of
has a keep-alive query. So why not use the opportunity to check whether
it is a primary or standby at the same time?
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
This patch is marked as "ready for committer", but that characterization
seems way over-optimistic. It looks like there are several unsettled
questions:
1. The connection parameter name and values are unlike the very similar
feature in pgJDBC. I think this is a fair complaint. Now I'm not in love
with "hostRecheckSeconds" --- that seems like a very squishily defined
thing with limited use-case, given Robert's argument that you shouldn't
be using this feature at all for short-lived connections. And
"loadBalanceHosts" is something we could leave for later. But it seems
pretty unfortunate not to follow pgJDBC's lead on the main parameter,
"targetServerType=(any | master | secondary | preferSecondary)".
The problem here of course is that whoever invented target_session_attrs
was unconcerned with following that precedent, so what we have is
"target_session_attrs=(any | read-write)".
Are we prepared to add some aliases in service of unifying these names?
2. Whether or not you want to follow pgJDBC's naming, it seems like we
ought to have both "require read only" and "prefer read only" behaviors
in this patch, and maybe likewise "require read write" versus "prefer
read write".
3. We ought to sync this up with whatever's going to happen in
https://commitfest.postgresql.org/21/1090/
at least to the extent of agreeing on what GUCs we'd like to see
the server start reporting.
4. Given that other discussion, it's not quite clear what we should
even be checking. The existing logic devolves to checking that
transaction_read_only is true, but that's not really the same thing as
"is a master server", eg you might have connected to a master server
under a role that has SET ROLE default_transaction_read_only = false.
(I wonder what pgJDBC is really checking, under the hood.)
Do we want to have modes that are checking hot-standby state in some
fashion, rather than the transaction_read_only state?
regards, tom lane
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
The problem here of course is that whoever invented target_session_attrs
was unconcerned with following that precedent, so what we have is
"target_session_attrs=(any | read-write)".
Are we prepared to add some aliases in service of unifying these names?
I think "yes".
2. Whether or not you want to follow pgJDBC's naming, it seems like we
ought to have both "require read only" and "prefer read only" behaviors
in this patch, and maybe likewise "require read write" versus "prefer
read write".
Agreed, although I don't see a use case for "prefer read write". I don't think there's an app like "I want to write, but I'm OK if I cannot."
3. We ought to sync this up with whatever's going to happen in
https://commitfest.postgresql.org/21/1090/
at least to the extent of agreeing on what GUCs we'd like to see
the server start reporting.
Yes.
4. Given that other discussion, it's not quite clear what we should
even be checking. The existing logic devolves to checking that
transaction_read_only is true, but that's not really the same thing as
"is a master server", eg you might have connected to a master server
under a role that has SET ROLE default_transaction_read_only = false.
(I wonder what pgJDBC is really checking, under the hood.)
Do we want to have modes that are checking hot-standby state in some
fashion, rather than the transaction_read_only state?
PgJDBC uses transaction_read_only like this:
[core/v3/ConnectionFactoryImpl.java]
private boolean isMaster(QueryExecutor queryExecutor) throws SQLException, IOException {
  byte[][] results = SetupQueryRunner.run(queryExecutor, "show transaction_read_only", true);
  String value = queryExecutor.getEncoding().decode(results[0]);
  return value.equalsIgnoreCase("off");
}
But as some people said, I don't think this is the right way. I suppose what's leading to the current somewhat complicated situation is that there was no easy way for the client to know whether the server is the master. That ended up in using "SHOW transaction_read_only" instead, and people supported that compromise by saying "read only status is more useful than whether the server is standby or not," I'm afraid.
The original desire should have been the ability to connect to a primary or a standby. So, I think we should go back to the original thinking (and not complicate the feature), and create a read only GUC_REPORT variable, say, server_role, that identifies whether the server is a primary or a standby.
Regards
Takayuki Tsunakawa
On Tue, Jan 15, 2019 at 02:00:57AM +0000, Tsunakawa, Takayuki wrote:
But as some people said, I don't think this is the right way. I
suppose what's leading to the current somewhat complicated situation
is that there was no easy way for the client to know whether the
server is the master. That ended up in using "SHOW
transaction_read_only" instead, and people supported that compromise
by saying "read only status is more useful than whether the server
is standby or not," I'm afraid.
Right. Another pain point is that it is complicated for a client to be
sure that it is connected to a standby: between the time
transaction_read_only is checked and the time the connection is reported
as ready for the application, you may actually end up linked to a
primary which has just been promoted. I am not personally sure if it is
worth caring about that level of detail to get something useful, but
there have been doubts about whether applications relying on read-only
clients can be served correctly without getting that absolutely right.
The original desire should have been the ability to connect to a
primary or a standby. So, I think we should go back to the original
thinking (and not complicate the feature), and create a read only
GUC_REPORT variable, say, server_role, that identifies whether the
server is a primary or a standby.
From the point of view of making sure that a client is really
connected to a primary or a standby, this is the best idea around.
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Tue, Jan 15, 2019 at 02:00:57AM +0000, Tsunakawa, Takayuki wrote:
The original desire should have been the ability to connect to a
primary or a standby. So, I think we should go back to the original
thinking (and not complicate the feature), and create a read only
GUC_REPORT variable, say, server_role, that identifies whether the
server is a primary or a standby.
From the point of view of making sure that a client is really
connected to a primary or a standby, this is the best idea around.
There are a couple of issues here:
1. Are you sure there are no use-cases for testing transaction_read_only
as such?
2. What will the fallback implementation be, when connecting to a server
too old to have the variable you want?
regards, tom lane
On Mon, 14 Jan 2019 at 21:19, Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
The problem here of course is that whoever invented target_session_attrs
was unconcerned with following that precedent, so what we have is
"target_session_attrs=(any | read-write)".
Are we prepared to add some aliases in service of unifying these names?
I think "yes".
Agreed. There's no downside to aliasing and I'd really like to see
consistency.
2. Whether or not you want to follow pgJDBC's naming, it seems like we
ought to have both "require read only" and "prefer read only" behaviors
in this patch, and maybe likewise "require read write" versus "prefer
read write".
Agreed, although I don't see a use case for "prefer read write". I don't
think there's an app like "I want to write, but I'm OK if I cannot."
3. We ought to sync this up with whatever's going to happen in
https://commitfest.postgresql.org/21/1090/
at least to the extent of agreeing on what GUCs we'd like to see
the server start reporting.
Yes.
4. Given that other discussion, it's not quite clear what we should
even be checking. The existing logic devolves to checking that
transaction_read_only is true, but that's not really the same thing as
"is a master server", eg you might have connected to a master server
under a role that has SET ROLE default_transaction_read_only = false.
(I wonder what pgJDBC is really checking, under the hood.)
Do we want to have modes that are checking hot-standby state in some
fashion, rather than the transaction_read_only state?
PgJDBC uses transaction_read_only like this:
[core/v3/ConnectionFactoryImpl.java]
private boolean isMaster(QueryExecutor queryExecutor) throws SQLException, IOException {
  byte[][] results = SetupQueryRunner.run(queryExecutor, "show transaction_read_only", true);
  String value = queryExecutor.getEncoding().decode(results[0]);
  return value.equalsIgnoreCase("off");
}
But as some people said, I don't think this is the right way. I suppose
what's leading to the current somewhat complicated situation is that there
was no easy way for the client to know whether the server is the master.
That ended up in using "SHOW transaction_read_only" instead, and people
supported that compromise by saying "read only status is more useful than
whether the server is standby or not," I'm afraid.
The original desire should have been the ability to connect to a primary
or a standby. So, I think we should go back to the original thinking (and
not complicate the feature), and create a read only GUC_REPORT variable,
say, server_role, that identifies whether the server is a primary or a
standby.
I'm confused as to how this would work. Who or what determines if the
server is a primary or standby?
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
On Mon, Jan 14, 2019 at 5:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
The problem here of course is that whoever invented target_session_attrs
was unconcerned with following that precedent, so what we have is
"target_session_attrs=(any | read-write)".
Are we prepared to add some aliases in service of unifying these names?
I wasn't unconcerned about the problem, but I wasn't prepared to be
the first person who added a connection parameter that used
namesLikeThis instead of names_like_this, especially if the semantics
weren't exactly the same. That seemed to be a recipe for somebody
yelling at me, and I try to avoid that when I can.
4. Given that other discussion, it's not quite clear what we should
even be checking. The existing logic devolves to checking that
transaction_read_only is true, but that's not really the same thing as
"is a master server", eg you might have connected to a master server
under a role that has SET ROLE default_transaction_read_only = false.
(I wonder what pgJDBC is really checking, under the hood.)
Do we want to have modes that are checking hot-standby state in some
fashion, rather than the transaction_read_only state?
Well, this has been discussed before, too, I'm pretty sure, but I'm
too lazy to go find the old discussion right now. The upshot is that
default_transaction_read_only lets an administrator make a server look
read-only even if it technically isn't, which somebody might find
useful. Otherwise what do you do if, for example, you are using
logical replication? None of your servers are in recovery, but you
can make some of them report default_transaction_read_only = true if
you like. To me, that kind of configurability is a feature, not a
bug.
That being said, I don't object to having even more values for
target_session_attrs that check other things. You could have:
read_only: default_transaction_read_only => true
read_write: default_transaction_read_only => false
master: pg_is_in_recovery => false
standby: pg_is_in_recovery => true
But what I think would be a Very Bad Plan is to use confused naming
that looks for something different than what it purports to do. For
example, if you were to change things so that read_write checks
pg_is_in_recovery(), then you might ask for a "read-write" server and
get one where only read-only transactions are permitted. We need not
assume that "read-write master" and "read-only standby" are the only
two kinds of things that can ever exist, as long as we're careful
about the names we choose. Choosing the names carefully also helps to
avoid POLA violations.
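To make the naming point concrete, here is a minimal sketch (not libpq code) of the four values listed above, each mapped to exactly the check its name promises. `ServerState` and `TargetAttr` are hypothetical names invented for this example; a real client would fill the two flags from default_transaction_read_only and pg_is_in_recovery().

```java
import java.util.function.Predicate;

class TargetAttrSketch {
    // Hypothetical snapshot of the two server facts discussed above.
    static final class ServerState {
        final boolean defaultTransactionReadOnly;
        final boolean inRecovery;

        ServerState(boolean defaultTransactionReadOnly, boolean inRecovery) {
            this.defaultTransactionReadOnly = defaultTransactionReadOnly;
            this.inRecovery = inRecovery;
        }
    }

    // Each value checks only the property its name refers to.
    enum TargetAttr {
        READ_ONLY(s -> s.defaultTransactionReadOnly),
        READ_WRITE(s -> !s.defaultTransactionReadOnly),
        MASTER(s -> !s.inRecovery),
        STANDBY(s -> s.inRecovery);

        private final Predicate<ServerState> check;

        TargetAttr(Predicate<ServerState> check) {
            this.check = check;
        }

        boolean accepts(ServerState s) {
            return check.test(s);
        }
    }

    public static void main(String[] args) {
        // A logical-replication subscriber: not in recovery, but configured
        // by the administrator to look read-only.
        ServerState subscriber = new ServerState(true, false);
        for (TargetAttr attr : TargetAttr.values()) {
            System.out.println(attr + " accepts subscriber: " + attr.accepts(subscriber));
        }
    }
}
```

In this sketch the logical-replication subscriber satisfies both READ_ONLY and MASTER, which is the case that a single "read-write master versus read-only standby" split cannot express.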
Another point I'd like to mention is that target_session_attrs could
be extended to care about other kinds of properties which someone
might want a server to have, quite apart from
master/standby/read-only/read-write. I don't know exactly what sort
of thing somebody might care about, but the name is such that we can
decide to care about other properties in the future without having to
add a whole new parameter. You can imagine a day when someone can say
target_session_attrs=read-write,v42+,ftl to get a server connection
that is read-write on a server running PostgreSQL 42 or greater that
also has a built-in hyperdrive. Or whatever.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Michael Paquier <michael@paquier.xyz> writes:
On Tue, Jan 15, 2019 at 02:00:57AM +0000, Tsunakawa, Takayuki wrote:
The original desire should have been the ability to connect to a
primary or a standby. So, I think we should go back to the original
thinking (and not complicate the feature), and create a read only
GUC_REPORT variable, say, server_role, that identifies whether the
server is a primary or a standby.
From the point of view of making sure that a client is really
connected to a primary or a standby, this is the best idea around.
There are a couple of issues here:
1. Are you sure there are no use-cases for testing transaction_read_only
as such?
I don't find any practical use case, but I won't object to leaving the current target_session_attrs as-is. Aside from that, I think a parameter like PgJDBC's is necessary, e.g., target_server_type = {primary | standby | prefer_standby}, which acts based on just the server role (primary or standby).
2. What will the fallback implementation be, when connecting to a server
too old to have the variable you want?
One of the following:
1. "Unsupported" error. I'll take this.
2. libpq issues "SELECT pg_is_in_recovery()".
3. Blindly accepts the first successful connection.
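The three fallback choices can be sketched as a small decision function. This is an illustration only: server_role is the proposed (not existing) GUC, and `Role`, `Fallback` and `resolveRole` are hypothetical names; the `isInRecovery` callback stands in for issuing "SELECT pg_is_in_recovery()".

```java
import java.util.Optional;
import java.util.function.BooleanSupplier;

class RoleFallbackSketch {
    enum Role { PRIMARY, STANDBY, UNKNOWN }

    // The three options listed above, in the same order.
    enum Fallback { UNSUPPORTED_ERROR, QUERY_IN_RECOVERY, ACCEPT_BLINDLY }

    // reportedServerRole is empty when the server is too old to report the
    // (hypothetical) server_role parameter at connection time.
    static Role resolveRole(Optional<String> reportedServerRole,
                            Fallback fallback,
                            BooleanSupplier isInRecovery) {
        if (reportedServerRole.isPresent()) {
            return "standby".equals(reportedServerRole.get()) ? Role.STANDBY : Role.PRIMARY;
        }
        switch (fallback) {
            case UNSUPPORTED_ERROR:
                throw new UnsupportedOperationException("server does not report server_role");
            case QUERY_IN_RECOVERY:
                // Stand-in for running "SELECT pg_is_in_recovery()".
                return isInRecovery.getAsBoolean() ? Role.STANDBY : Role.PRIMARY;
            default:
                // Option 3: take the connection without knowing the role.
                return Role.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveRole(Optional.of("primary"), Fallback.UNSUPPORTED_ERROR, () -> false));
        System.out.println(resolveRole(Optional.empty(), Fallback.QUERY_IN_RECOVERY, () -> true));
        System.out.println(resolveRole(Optional.empty(), Fallback.ACCEPT_BLINDLY, () -> true));
    }
}
```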
Regards
Takayuki Tsunakawa
From: Dave Cramer [mailto:pg@fastcrypt.com]
The original desire should have been the ability to connect to a
primary or a standby. So, I think we should go back to the original thinking
(and not complicate the feature), and create a read only GUC_REPORT variable,
say, server_role, that identifies whether the server is a primary or a
standby.
I'm confused as to how this would work. Who or what determines if the server
is a primary or standby?
Overall, the server determines the server role (primary or standby) using the same mechanism as pg_is_in_recovery(), and sets the server_role GUC parameter. As the parameter is GUC_REPORT, the change is reported to the clients using the ParameterStatus ('S') message. The clients also get the value at connection.
Regards
Takayuki Tsunakawa
I'm confused as to how this would work. Who or what determines if the server
is a primary or standby?
Overall, the server determines the server role (primary or standby) using the same mechanism as pg_is_in_recovery(), and sets the server_role GUC parameter. As the parameter is GUC_REPORT, the change is reported to the clients using the ParameterStatus ('S') message. The clients also get the value at connection.
But pg_is_in_recovery() returns true even for a promoting standby. So
you have to wait and retry to send pg_is_in_recovery() until it
finishes the promotion to find out it is now a primary. I am not sure
if the backend ought to be responsible for this process. If not, libpq would
need to handle it but I doubt it would be possible.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
From: Tatsuo Ishii [mailto:ishii@sraoss.co.jp]
But pg_is_in_recovery() returns true even for a promoting standby. So
you have to wait and retry to send pg_is_in_recovery() until it
finishes the promotion to find out it is now a primary. I am not sure
if the backend ought to be responsible for this process. If not, libpq would
need to handle it but I doubt it would be possible.
Yes, the application needs to retry connection attempts until success. That's not different from PgJDBC and other DBMSs.
Regards
Takayuki Tsunakawa
From: Tatsuo Ishii [mailto:ishii@sraoss.co.jp]
But pg_is_in_recovery() returns true even for a promoting standby. So
you have to wait and retry to send pg_is_in_recovery() until it
finishes the promotion to find out it is now a primary. I am not sure
if the backend ought to be responsible for this process. If not, libpq would
need to handle it but I doubt it would be possible.
Yes, the application needs to retry connection attempts until success. That's not different from PgJDBC and other DBMSs.
I don't know what PgJDBC is doing, however I think libpq needs to do
more than just retrying.
1) Try to find a node on which pg_is_in_recovery() returns false. If
found, then we assume that is the primary. We also assume that
other nodes are standbys. done.
2) If there's no node on which pg_is_in_recovery() returns false, then
we need to retry until we find it. To not retry forever, there
should be a timeout counter parameter.
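The two steps above can be sketched as a bounded retry loop. This is a minimal illustration, not libpq or PgJDBC code: `findPrimary`, the host names and the timeout and pause parameters are hypothetical, and the `isInRecovery` callback stands in for connecting to a host and running pg_is_in_recovery().

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

class PrimarySearchSketch {
    // Returns the first host that is not in recovery, retrying all hosts
    // until one is found or the timeout expires.
    static Optional<String> findPrimary(List<String> hosts,
                                        Predicate<String> isInRecovery,
                                        long timeoutMillis,
                                        long pauseMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            // Step 1: look for a node where pg_is_in_recovery() is false.
            for (String host : hosts) {
                if (!isInRecovery.test(host)) {
                    return Optional.of(host);
                }
            }
            // Step 2: nothing found; give up once the timeout has passed.
            if (System.nanoTime() - deadline >= 0) {
                return Optional.empty();
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Optional.empty();
            }
        }
    }

    public static void main(String[] args) {
        // Simulated promotion: node2 stays in recovery for its first three
        // probes and reports itself as promoted afterwards.
        AtomicInteger node2Probes = new AtomicInteger();
        Predicate<String> probe =
            host -> !(host.equals("node2") && node2Probes.incrementAndGet() > 3);
        System.out.println(findPrimary(Arrays.asList("node1", "node2"), probe, 5000, 10));
    }
}
```

Whether such a loop belongs in libpq or in the application is exactly the open question here; the sketch only shows that the timeout keeps the retry bounded.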
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
From: Tatsuo Ishii [mailto:ishii@sraoss.co.jp]
I don't know what PgJDBC is doing, however I think libpq needs to do
more than just retrying.
1) Try to find a node on which pg_is_in_recovery() returns false. If
found, then we assume that is the primary. We also assume that
other nodes are standbys. done.
2) If there's no node on which pg_is_in_recovery() returns false, then
we need to retry until we find it. To not retry forever, there
should be a timeout counter parameter.
It may be convenient for libpq to be able to retry connection attempts for a specified duration (by failover_timeout or such), because it eliminates the need for the application to do the retry. But I think it's a desirable feature, not a required one.
Regards
Takayuki Tsunakawa
Tsunakawa, Takayuki wrote:
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
The problem here of course is that whoever invented target_session_attrs
was unconcerned with following that precedent, so what we have is
"target_session_attrs=(any | read-write)".
Are we prepared to add some aliases in service of unifying these names?
I think "yes".
2. Whether or not you want to follow pgJDBC's naming, it seems like we
ought to have both "require read only" and "prefer read only" behaviors
in this patch, and maybe likewise "require read write" versus "prefer
read write".
Agreed, although I don't see a use case for "prefer read write". I don't think
there's an app like "I want to write, but I'm OK if I cannot."
I don't think so either, although of course I cannot prove it.
My opinion is that we shouldn't add options like "prefer read write"
just out of a fuzzy desire for symmetry. It would probably make the code
even more complicated, and more choice means that it becomes harder for
the user to pick the right one (the latter may be a weak argument).
The motivation behind all this is to load balance reading and writing
sessions among a group of replicating servers where you don't know for sure
who is what at the moment, so "preferably read-only", "must be able to write"
and "don't care" are choice enough.
There is nothing that bars future patches from adding additional modes
if the need really arises.
4. Given that other discussion, it's not quite clear what we should
even be checking. The existing logic devolves to checking that
transaction_read_only is true, but that's not really the same thing as
"is a master server", eg you might have connected to a master server
under a role that has SET ROLE default_transaction_read_only = false.
PgJDBC uses transaction_read_only like this: [...]
But as some people said, I don't think this is the right way. I suppose what's leading
to the current somewhat complicated situation is that there was no easy way for the
client to know whether the server is the master. That ended up in using
"SHOW transaction_read_only" instead, and people supported that compromise by saying
"read only status is more useful than whether the server is standby or not," I'm afraid.
The original desire should have been the ability to connect to a primary or a standby.
So, I think we should go back to the original thinking (and not complicate the feature),
and create a read only GUC_REPORT variable, say, server_role, that identifies whether
the server is a primary or a standby.
I think that transaction_read_only is good.
If it is set to false, we are sure to be on a replication primary or
stand-alone server, which is enough to know for the load balancing use case.
I deem it unlikely that someone will set default_transaction_read_only to
FALSE and then complain that the feature is not working as expected, but again
I cannot prove that claim.
As Robert said, transaction_read_only might even give you the option to
use the feature for more than just load balancing between replication master and standby.
Yours,
Laurenz Albe
On Tue, 15 Jan 2019 at 23:21, Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
From: Dave Cramer [mailto:pg@fastcrypt.com]
The original desire should have been the ability to connect to a
primary or a standby. So, I think we should go back to the original thinking
(and not complicate the feature), and create a read only GUC_REPORT variable,
say, server_role, that identifies whether the server is a primary or a
standby.
I'm confused as to how this would work. Who or what determines if the server
is a primary or standby?
Overall, the server determines the server role (primary or standby) using
the same mechanism as pg_is_in_recovery(), and sets the server_role GUC
parameter. As the parameter is GUC_REPORT, the change is reported to the
clients using the ParameterStatus ('S') message. The clients also get the
value at connection.
Thanks, that clarifies it.
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
On Wed, 16 Jan 2019 at 01:02, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
From: Tatsuo Ishii [mailto:ishii@sraoss.co.jp]
But pg_is_in_recovery() returns true even for a promoting standby. So
you have to wait and retry to send pg_is_in_recovery() until it
finishes the promotion to find out it is now a primary. I am not sure
if the backend ought to be responsible for this process. If not, libpq would
need to handle it but I doubt it would be possible.
Yes, the application needs to retry connection attempts until success.
That's not different from PgJDBC and other DBMSs.
I don't know what PgJDBC is doing, however I think libpq needs to do
more than just retrying.
1) Try to find a node on which pg_is_in_recovery() returns false. If
found, then we assume that is the primary. We also assume that
other nodes are standbys. done.
2) If there's no node on which pg_is_in_recovery() returns false, then
we need to retry until we find it. To not retry forever, there
should be a timeout counter parameter.
IIRC this is essentially what pgJDBC does.
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
On Thu, 17 Jan 2019 at 05:59, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
Tsunakawa, Takayuki wrote:
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
The problem here of course is that whoever invented target_session_attrs
was unconcerned with following that precedent, so what we have is
"target_session_attrs=(any | read-write)".
Are we prepared to add some aliases in service of unifying these names?
I think "yes".
2. Whether or not you want to follow pgJDBC's naming, it seems like we
ought to have both "require read only" and "prefer read only" behaviors
in this patch, and maybe likewise "require read write" versus "prefer
read write".
I just had a look at the JDBC code; there is no "prefer read write". There is
a "preferSecondary".
The logic behind this is that the connection would presumably be doing only
reads, so ideally it would like a secondary,
but if it can't find one it will connect to a primary.
To be clear, there are 4 target server types in pgJDBC: "any",
"master", "secondary", and "preferSecondary" (looking at this I need to
alias master to primary, but that's another discussion).
I have no idea where "I want to write but I'm OK if I cannot" came from.
Dave
On Wed, 16 Jan 2019 at 01:02, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
From: Tatsuo Ishii [mailto:ishii@sraoss.co.jp]
But pg_is_in_recovery() returns true even for a promoting standby. So
you have to wait and retry to send pg_is_in_recovery() until it
finishes the promotion to find out it is now a primary. I am not sure
if the backend ought to be responsible for this process. If not, libpq would
need to handle it but I doubt it would be possible.
Yes, the application needs to retry connection attempts until success.
That's not different from PgJDBC and other DBMSs.
I don't know what PgJDBC is doing, however I think libpq needs to do
more than just retrying.
1) Try to find a node on which pg_is_in_recovery() returns false. If
found, then we assume that is the primary. We also assume that
other nodes are standbys. done.
2) If there's no node on which pg_is_in_recovery() returns false, then
we need to retry until we find it. To not retry forever, there
should be a timeout counter parameter.
IIRC this is essentially what pgJDBC does.
Thanks for clarifying that. Pgpool-II does that too. Seems like a
common technique to find out a primary node.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
On Thu, 17 Jan 2019 at 18:03, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
On Wed, 16 Jan 2019 at 01:02, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
From: Tatsuo Ishii [mailto:ishii@sraoss.co.jp]
But pg_is_in_recovery() returns true even for a promoting standby. So
you have to wait and retry to send pg_is_in_recovery() until it
finishes the promotion to find out it is now a primary. I am not sure
if the backend ought to be responsible for this process. If not, libpq would
need to handle it but I doubt it would be possible.
Yes, the application needs to retry connection attempts until success.
That's not different from PgJDBC and other DBMSs.
I don't know what PgJDBC is doing, however I think libpq needs to do
more than just retrying.
1) Try to find a node on which pg_is_in_recovery() returns false. If
found, then we assume that is the primary. We also assume that
other nodes are standbys. done.
2) If there's no node on which pg_is_in_recovery() returns false, then
we need to retry until we find it. To not retry forever, there
should be a timeout counter parameter.
IIRC this is essentially what pgJDBC does.
Thanks for clarifying that. Pgpool-II does that too. Seems like a
common technique to find out a primary node.
Checking the code I see we actually use show transaction_read_only.
Sorry for the confusion.
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
From: Dave Cramer [mailto:pg@fastcrypt.com]
2) If there's no node on which pg_is_in_recovery() returns false, then
we need to retry until we find it. To not retry forever, there
should be a timeout counter parameter.
Checking the code I see we actually use show transaction_read_only.
Also, does PgJDBC really repeat connection attempts for a user-specified duration? Having a quick look at the code, it seemed to try each host once in a while loop.
Regards
Takayuki Tsunakawa
On Thu, 17 Jan 2019 at 18:03, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
On Wed, 16 Jan 2019 at 01:02, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
From: Tatsuo Ishii [mailto:ishii@sraoss.co.jp]
But pg_is_in_recovery() returns true even for a promoting standby. So
you have to wait and retry to send pg_is_in_recovery() until it
finishes the promotion to find out it is now a primary. I am not sure
if the backend ought to be responsible for this process. If not, libpq would
need to handle it but I doubt it would be possible.
Yes, the application needs to retry connection attempts until success.
That's not different from PgJDBC and other DBMSs.
I don't know what PgJDBC is doing, however I think libpq needs to do
more than just retrying.
1) Try to find a node on which pg_is_in_recovery() returns false. If
found, then we assume that is the primary. We also assume that
other nodes are standbys. done.
2) If there's no node on which pg_is_in_recovery() returns false, then
we need to retry until we find it. To not retry forever, there
should be a timeout counter parameter.
IIRC this is essentially what pgJDBC does.
Thanks for clarifying that. Pgpool-II does that too. Seems like a
common technique to find out a primary node.
Checking the code I see we actually use show transaction_read_only.
Sorry for the confusion.
So if all PostgreSQL servers return transaction_read_only = on, how
does pgJDBC find the primary node?
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
On Thu, 17 Jan 2019 at 19:15, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
On Thu, 17 Jan 2019 at 18:03, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
On Wed, 16 Jan 2019 at 01:02, Tatsuo Ishii <ishii@sraoss.co.jp>
wrote:
From: Tatsuo Ishii [mailto:ishii@sraoss.co.jp]
But pg_is_in_recovery() returns true even for a promoting standby. So
you have to wait and retry to send pg_is_in_recovery() until it
finishes the promotion to find out it is now a primary. I am not sure
if the backend ought to be responsible for this process. If not, libpq would
need to handle it but I doubt it would be possible.
Yes, the application needs to retry connection attempts until success.
That's not different from PgJDBC and other DBMSs.
I don't know what PgJDBC is doing, however I think libpq needs to do
more than just retrying.
1) Try to find a node on which pg_is_in_recovery() returns false. If
found, then we assume that is the primary. We also assume that
other nodes are standbys. done.
2) If there's no node on which pg_is_in_recovery() returns false, then
we need to retry until we find it. To not retry forever, there
should be a timeout counter parameter.
IIRC this is essentially what pgJDBC does.
Thanks for clarifying that. Pgpool-II does that too. Seems like a
common technique to find out a primary node.
Checking the code I see we actually use show transaction_read_only.
Sorry for the confusion.
So if all PostgreSQL servers return transaction_read_only = on, how
does pgJDBC find the primary node?
Well, preferSecondary would return a connection.
I'm curious; under what circumstances would the above occur?
Regards,
Dave
On Thu, 17 Jan 2019 at 19:09, Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
From: Dave Cramer [mailto:pg@fastcrypt.com]
2) If there's no node on which pg_is_in_recovery() returns false, then
we need to retry until we find it. To not retry forever, there
should be a timeout counter parameter.
Checking the code I see we actually use show transaction_read_only.
Also, does PgJDBC really repeat connection attempts for a user-specified
duration? Having a quick look at the code, it seemed to try each host once
in a while loop.
You are correct looking at the code again. On the initial connection
attempt we only try once.
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
From: Tatsuo Ishii [mailto:ishii@sraoss.co.jp]
But pg_is_in_recovery() returns true even for a promoting standby. So
you have to wait and retry to send pg_is_in_recovery() until it
finishes the promotion to find out it is now a primary. I am not sure
if backend ought to be responsible for this process. If not, libpq
would need to handle it but I doubt it would be possible.
Yes, the application needs to retry connection attempts until success.
That's not different from PgJDBC and other DBMSs.
I don't know what PgJDBC is doing, however I think libpq needs to do
more than just retrying.
1) Try to find a node on which pg_is_in_recovery() returns false. If
found, then we assume that is the primary. We also assume that
other nodes are standbys. done.
2) If there's no node on which pg_is_in_recovery() returns false, then
we need to retry until we find it. To not retry forever, there
should be a timeout counter parameter.
IIRC this is essentially what pgJDBC does.
Thanks for clarifying that. Pgpool-II also does that too. Seems like a
common technique to find out a primary node.
Checking the code I see we actually use show transaction_read_only.
Sorry for the confusion
So if all PostgreSQL servers return transaction_read_only = on, how
does pgJDBC find the primary node?
well preferSecondary would return a connection.
This is not my message :-)
I'm curious; under what circumstances would the above occur?
Former primary goes down and one of standbys is promoting but it is
not promoted to new primary yet.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
On Thu, 17 Jan 2019 at 19:38, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
From: Tatsuo Ishii [mailto:ishii@sraoss.co.jp]
But pg_is_in_recovery() returns true even for a promoting standby. So
you have to wait and retry to send pg_is_in_recovery() until it
finishes the promotion to find out it is now a primary. I am not sure
if backend ought to be responsible for this process. If not, libpq
would need to handle it but I doubt it would be possible.
Yes, the application needs to retry connection attempts until success.
That's not different from PgJDBC and other DBMSs.
I don't know what PgJDBC is doing, however I think libpq needs to do
more than just retrying.
1) Try to find a node on which pg_is_in_recovery() returns false. If
found, then we assume that is the primary. We also assume that
other nodes are standbys. done.
2) If there's no node on which pg_is_in_recovery() returns false, then
we need to retry until we find it. To not retry forever, there
should be a timeout counter parameter.
IIRC this is essentially what pgJDBC does.
Thanks for clarifying that. Pgpool-II also does that too. Seems like a
common technique to find out a primary node.
Checking the code I see we actually use show transaction_read_only.
Sorry for the confusion
So if all PostgreSQL servers return transaction_read_only = on, how
does pgJDBC find the primary node?
well preferSecondary would return a connection.
This is not my message :-)
I'm curious; under what circumstances would the above occur?
Former primary goes down and one of standbys is promoting but it is
not promoted to new primary yet.
seems like JDBC might have some work to do...
Thanks
I'm going to wait to implement until we resolve this discussion
Dave
I'm curious; under what circumstances would the above occur?
Former primary goes down and one of standbys is promoting but it is
not promoted to new primary yet.
seems like JDBC might have some work to do...
Thanks
I'm going to wait to implement until we resolve this discussion
If you need some input from me regarding finding a primary node,
please say so. While working on Pgpool-II project, I learned the
necessity in a hard way.
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
On Thu, 17 Jan 2019 at 19:56, Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
I'm curious; under what circumstances would the above occur?
Former primary goes down and one of standbys is promoting but it is
not promoted to new primary yet.
seems like JDBC might have some work to do...
Thanks
I'm going to wait to implement until we resolve this discussion
If you need some input from me regarding finding a primary node,
please say so. While working on Pgpool-II project, I learned the
necessity in a hard way.
I would really like to have a consistent way of doing this, and consistent
terms for the connection parameters.
that said yes, I would like input from you.
Thanks,
Dave
From: Laurenz Albe [mailto:laurenz.albe@cybertec.at]
I think that transaction_read_only is good.
If it is set to false, we are sure to be on a replication primary or
stand-alone server, which is enough to know for the load balancing use case.
As Tatsuo-san said, setting default_transaction_read_only leads to a misjudgement of the primary.
I deem it unlikely that someone will set default_transaction_read_only to
FALSE and then complain that the feature is not working as expected, but
again
I cannot prove that claim.
I wonder what default_transaction_read_only exists for. For making the database read-only by default and allowing only specific users to write to the database with "CREATE/ALTER USER SET default_transaction_read_only = false"?
I'm sorry to repeat myself, but anyway, I think we need a method to connect to a standby as the original desire, because the primary instance may be read only by default while only limited users update data. That's for reducing the burden on the primary and minimizing the impact on users who update data. For example,
* run data reporting on the standby
* backup the database from the standby
* cascade replication from the standby
As Robert said, transaction_read_only might even give you the option to
use the feature for more than just load balancing between replication master
and standby.
What use case do you think of? If you want to load balance the workload between the primary and standbys, we can follow PgJDBC -- targetServerType=any.
Regards
Takayuki Tsunakawa
On Fri, Jan 18, 2019 at 2:34 PM Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
From: Laurenz Albe [mailto:laurenz.albe@cybertec.at]
I think that transaction_read_only is good.
If it is set to false, we are sure to be on a replication primary or
stand-alone server, which is enough to know for the load balancing use case.
As Tatsuo-san said, setting default_transaction_read_only leads to a
misjudgement of the primary.
I deem it unlikely that someone will set default_transaction_read_only to
FALSE and then complain that the feature is not working as expected, but
again I cannot prove that claim.
I wonder what default_transaction_read_only exists for. For making the
database read-only by default and allowing only specific users to write to the
database with "CREATE/ALTER USER SET default_transaction_read_only = false"?
default_transaction_read_only is a user-settable parameter: even if it is
set to true by default, any user can change it later. Deciding whether a
server is read-write or read-only based on it can therefore go wrong.
I'm sorry to repeat myself, but anyway, I think we need a method to
connect to a standby as the original desire, because the primary instance
may be read only by default while only limited users update data. That's
for reducing the burden on the primary and minimizing the impact on users
who update data. For example,
* run data reporting on the standby
* backup the database from the standby
* cascade replication from the standby
IMO, if we use only pg_is_in_recovery() to decide whether to connect, we
may not be able to support all the possible target_session_attrs values.
How about using both to decide?
Master/read-write -- recovery = false and default_transaction_read_only = false
Standby/read-only -- recovery = true
prefer-standby/prefer-read -- recovery = true or default_transaction_read_only = true
any -- Nothing to be verified
I feel the above verifications can cover both physical and logical
replication.
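A minimal sketch of the check matrix above, in C. The enum and function names are hypothetical and not part of libpq; the two boolean inputs stand for the result of pg_is_in_recovery() and the value of default_transaction_read_only, however they are obtained.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not libpq API. */
typedef enum
{
    TARGET_ANY,
    TARGET_READ_WRITE,      /* "Master/read-write" */
    TARGET_READ_ONLY,       /* "Standby/read-only" */
    TARGET_PREFER_READ      /* "prefer-standby/prefer-read" */
} TargetType;

/*
 * in_recovery:       result of SELECT pg_is_in_recovery()
 * default_read_only: value of default_transaction_read_only
 *
 * Returns true if the server satisfies the requested target.
 */
bool
server_matches(TargetType target, bool in_recovery, bool default_read_only)
{
    switch (target)
    {
        case TARGET_ANY:
            return true;        /* nothing to be verified */
        case TARGET_READ_WRITE:
            return !in_recovery && !default_read_only;
        case TARGET_READ_ONLY:
            return in_recovery;
        case TARGET_PREFER_READ:
            return in_recovery || default_read_only;
    }
    return false;
}
```

A "prefer" mode would presumably need a second pass that accepts any reachable host when no server matches; that fallback is left out of this sketch.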
We can then decide which options to support. Also, if we don't want to
rely on the user-settable default_transaction_read_only parameter, could
we add a new parameter that can be changed only with a server restart?
Regards,
Haribabu Kommi
Fujitsu Australia
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
IMO, if we try to use only pg_is_in_recovery() only to decide to connect,
we may not
support all the target_session_attrs that are possible. how about using
both to decide?
I favor adding a new parameter like target_server_type whose purpose is to select the server role. That aligns better with PgJDBC's naming, which conveys consistency to PostgreSQL users. Again, the original desire should have been to connect to a standby to eliminate the burden on the primary, where the primary may be read-only by default and only a limited group of users are allowed to write to the database.
I don't know what kind of realistic use cases there are that request read-only session in the logical replication configuration. I think we can just leave target_session_attrs as what it is now in PostgreSQL 11, for compatibility and possibly future new use cases.
Master/read-write -- recovery = false and default_transaction_read_only = false
Standby/read-only -- recovery = true
prefer-standby/prefer-read -- recovery = true or default_transaction_read_only = true
any -- Nothing to be verified
I feel above verifications can cover for both physical and logical
replication.
As for prefer-standby/prefer-read, if host parameter specifies host1,host2 in this order, and host1 is the primary with default_transaction_read_only=true, does the app get a connection to host1? I want to connect to host2 (standby) only if host1 is down.
Regards
Takayuki Tsunakawa
If you need some input from me regarding finding a primary node,
please say so. While working on Pgpool-II project, I learned the
necessity in a hard way.
I would really like to have a consistent way of doing this, and consistent
terms for the connection parameters.
that said yes, I would like input from you.
Sure, no problem.
- Upon Pgpool-II starting up or receiving a failover event or switch
over event, primary node finding is executed.
- It repeats the following until the timeout parameter
"search_primary_node_timeout" expires:
do until the timeout is expired
{
    for all_live_backends
    {
        connect to the backend.
        execute "SELECT pg_is_in_recovery()".
        if it returns false, then we have found the primary node. Assume
        other backends are standbys and we are done.
        disconnect from the backend
    }
    sleep 1 second;
}
If no primary node was found, all backends are regarded as standbys.
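The loop above can be sketched in C as follows. This is not Pgpool-II's actual code: the callback types, the fake cluster and the function names are all hypothetical, and the timeout is simplified to a maximum number of one-second rounds. A real probe would connect with libpq, run "SELECT pg_is_in_recovery()", and treat connection failures as "not the primary".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical probe: returns what pg_is_in_recovery() reports on `node`. */
typedef bool (*in_recovery_fn) (int node, void *arg);

/* Hypothetical stand-in for "sleep 1 second". */
typedef void (*sleep_fn) (void *arg);

/*
 * Returns the index of the first backend that is not in recovery, or -1
 * if none is found within max_rounds passes. In the -1 case the caller
 * regards all backends as standbys.
 */
int
find_primary_node(int num_nodes, int max_rounds,
                  in_recovery_fn in_recovery, sleep_fn wait_one_second,
                  void *arg)
{
    for (int round = 0; round < max_rounds; round++)
    {
        for (int node = 0; node < num_nodes; node++)
        {
            if (!in_recovery(node, arg))
                return node;    /* found the primary; others are standbys */
        }
        if (wait_one_second != NULL)
            wait_one_second(arg);
    }
    return -1;
}

/* Simulated cluster: one standby finishes promotion after some rounds. */
typedef struct
{
    int promoted_node;
    int rounds_until_promoted;
    int rounds_slept;
} FakeCluster;

bool
fake_in_recovery(int node, void *arg)
{
    FakeCluster *c = arg;

    return !(node == c->promoted_node &&
             c->rounds_slept >= c->rounds_until_promoted);
}

void
fake_sleep(void *arg)
{
    ((FakeCluster *) arg)->rounds_slept++;
}
```

With the fake cluster, a standby that is still promoting is reported as "in recovery" for the first rounds, which is exactly the window in which a single-pass check would find no primary.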
In addition to above, recent Pgpool-II versions do optional checking
to verify backend status, for example, finding a case where there
are two primary nodes.
- If there are two primaries, check the connectivity between each
primary and standbys using the pg_stat_wal_receiver view (so this can not
be executed with PostgreSQL version 9.5 or before)
- If there's a primary (call it "A") which is not connected to any of
standbys while there's a primary (call it "B") which is connected to
all of standbys, then A is regarded as a "false primary" (and
Pgpool-II detaches it from the streaming replication cluster managed
by Pgpool-II if detach_false_primary is enabled).
See Pgpool-II manual "detach_false_primary" section in
http://tatsuo-ishii.github.io/pgpool-II/current/runtime-config-failover.html for more details.
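The "false primary" rule described above can be illustrated with a small C sketch. The struct and function names are hypothetical and this is not Pgpool-II's implementation; it assumes the connectivity of each standby to each primary has already been collected (for example from pg_stat_wal_receiver on the standbys).

```c
#include <stdbool.h>

#define MAX_STANDBYS 16

/* connected[s] is true when standby s is streaming from this primary. */
typedef struct
{
    bool connected[MAX_STANDBYS];
} PrimaryInfo;

static int
count_connected(const PrimaryInfo *p, int num_standbys)
{
    int n = 0;

    for (int s = 0; s < num_standbys; s++)
    {
        if (p->connected[s])
            n++;
    }
    return n;
}

/*
 * Given two primaries, returns 0 if `a` is the false primary, 1 if `b`
 * is, or -1 if the rule does not decide (e.g. partial connectivity).
 */
int
find_false_primary(const PrimaryInfo *a, const PrimaryInfo *b,
                   int num_standbys)
{
    int ca, cb;

    if (num_standbys <= 0 || num_standbys > MAX_STANDBYS)
        return -1;

    ca = count_connected(a, num_standbys);
    cb = count_connected(b, num_standbys);

    if (ca == 0 && cb == num_standbys)
        return 0;               /* a has no standbys, b has all of them */
    if (cb == 0 && ca == num_standbys)
        return 1;
    return -1;
}
```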
Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
From: Tsunakawa, Takayuki [mailto:tsunakawa.takay@jp.fujitsu.com]
As for prefer-standby/prefer-read, if host parameter specifies host1,host2
in this order, and host1 is the primary with
default_transaction_read_only=true, does the app get a connection to host1?
I want to connect to host2 (standby) only if host1 is down.
Oops, reverse -- I wanted to say "I want to connect to host1 (primary) only if host2 is down."
Regards
Takayuki Tsunakawa
Tsunakawa, Takayuki wrote:
From: Laurenz Albe [mailto:laurenz.albe@cybertec.at]
I think that transaction_read_only is good.
If it is set to false, we are sure to be on a replication primary or
stand-alone server, which is enough to know for the load balancing use case.
As Tatsuo-san said, setting default_transaction_read_only leads to a misjudgement of the primary.
Yes, you can have a false negative, i.e. fail to recognize a primary server.
I deem it unlikely that someone will set default_transaction_read_only to
FALSE and then complain that the feature is not working as expected, but
again
I cannot prove that claim.
I wonder what default_transaction_read_only exists for. For making the database read-only by default
and allowing only specific users to write to the database with "CREATE/ALTER USER SET
default_transaction_read_only = false"?
I'd guess that the main use of default_transaction_read_only is to make sure an
application (that isn't smart enough to change the parameter) won't modify any data.
I'm sorry to repeat myself, but anyway, I think we need a method to connect to a standby
as the original desire, because the primary instance may be read only by default while
only limited users update data. That's for reducing the burden on the primary and
minimizing the impact on users who update data. For example,
* run data reporting on the standby
* backup the database from the standby
* cascade replication from the standby
I see.
But then the new value should not be called "prefer-read", because that would be
misleading. It would also not be related to the existing "read-write".
For what you have in mind, there should be the options "primary-required" and
"standby-preferred", however we implement them.
Have there been a lot of complaints that the existing "read-write" is not good
enough to detect replication primaries?
As Robert said, transaction_read_only might even give you the option to
use the feature for more than just load balancing between replication master
and standby.
What use case do you think of? If you want to load balance the workload between
the primary and standbys, we can follow PgJDBC -- targetServerType=any.
One use case I can think of is logical replication (or other replication methods like
Slony). You can use the feature by setting default_transaction_read_only = on
on the standby.
Yours,
Laurenz Albe
On Fri, Jan 18, 2019 at 5:33 PM Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
From: Tsunakawa, Takayuki [mailto:tsunakawa.takay@jp.fujitsu.com]
As for prefer-standby/prefer-read, if host parameter specifies host1,host2
in this order, and host1 is the primary with
default_transaction_read_only=true, does the app get a connection to host1?
I want to connect to host2 (standby) only if host1 is down.
Oops, reverse -- I wanted to say "I want to connect to host1 (primary)
only if host2 is down."
Thanks for finding the problem. How about the following order of checks
for prefer-read/prefer-standby?
1. (default_transaction_read_only = true and recovery = true)
2. If none of the hosts satisfies the above, then recovery = true
3. The last check is for default_transaction_read_only = true
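The three-step order above could look like the following C sketch. HostState and pick_prefer_read_host are hypothetical names, not libpq code, and the per-host flags are assumed to have been probed already.

```c
#include <stdbool.h>

/* Hypothetical per-host probe result. */
typedef struct
{
    bool reachable;             /* connection attempt succeeded */
    bool in_recovery;           /* pg_is_in_recovery() */
    bool default_read_only;     /* default_transaction_read_only */
} HostState;

/*
 * Returns the index of the host chosen for prefer-read/prefer-standby,
 * trying the three checks in order over the whole host list, or -1 if
 * no reachable host passes any of them.
 */
int
pick_prefer_read_host(const HostState *hosts, int num_hosts)
{
    for (int pass = 1; pass <= 3; pass++)
    {
        for (int i = 0; i < num_hosts; i++)
        {
            const HostState *h = &hosts[i];
            bool ok;

            if (!h->reachable)
                continue;

            if (pass == 1)
                ok = h->in_recovery && h->default_read_only;
            else if (pass == 2)
                ok = h->in_recovery;
            else
                ok = h->default_read_only;

            if (ok)
                return i;
        }
    }
    return -1;
}
```

With host1 a primary that has default_transaction_read_only = true and host2 a standby, the second pass picks host2; host1 is chosen only by the third pass, that is, when host2 is unreachable.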
Regards,
Haribabu Kommi
Fujitsu Australia
From: Laurenz Albe [mailto:laurenz.albe@cybertec.at]
Tsunakawa, Takayuki wrote:
I'm sorry to repeat myself, but anyway, I think we need a method to connect
to a standby as the original desire, because the primary instance may be
read only by default while only limited users update data. That's for
reducing the burden on the primary and minimizing the impact on users who
update data. For example,
* run data reporting on the standby
* backup the database from the standby
* cascade replication from the standby
I see.
But then the new value should not be called "prefer-read", because that
would be misleading. It would also not be related to the existing "read-write".
For what you have in mind, there should be the options "primary-required"
and "standby-preferred", however we implement them.
Yes, that's what I'm proposing and expecting with a new parameter whose naming follows PgJDBC's.
Have there been a lot of complaints that the existing "read-write" is not
good enough to detect replication primaries?
I haven't heard anything. I guess almost nobody uses default_transaction_read_only.
Before that, see the description of target_session_attrs:
https://www.postgresql.org/docs/devel/libpq-connect.html#LIBPQ-PARAMKEYWORDS
I'm afraid most users don't know whether they can connect to the primary/standby. Just searching "primary", "master" or "standby" in this page doesn't show anything relevant.
One use case I can think of is logical replication (or other replication
methods like Slony). You can use the feature by setting
default_transaction_read_only = on on the standby.
I know that, but I doubt that's really a practical use case. Anyway, I'm OK with relying on target_session_attrs to fulfill that need.
Regards
Takayuki Tsunakawa
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
Thanks for finding out the problem, how about the following way of checking
for prefer-read/prefer-standby.
1. (default_transaction_read_only = true and recovery = true)
2. If none of the host satisfies the above scenario, then recovery = true
3. Last check is for default_transaction_read_only = true
That would be fine. But as I mentioned in another mail, I think "get read-only session" and "connect to standby" differ. So I find it better to have separate parameters for those requests: target_session_attrs and target_server_type.
Regards
Takayuki Tsunakawa
On Mon, Jan 21, 2019 at 06:48:14AM +0000, Tsunakawa, Takayuki wrote:
That would be fine. But as I mentioned in another mail, I think
"get read-only session" and "connect to standby" differ. So I find
it better to separate parameters for those request;
target_session_attr and target_server_type.
We've had plenty of discussions about this patch, and nothing really
got out of the crowd. For now I am marking the patch as returned with
feedback as it has been marked as waiting on author for two weeks now.
It may be worth continuing the discussion, still we need to come up
with an agreement first.
--
Michael
On Mon, Jan 21, 2019 at 5:48 PM Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
Thanks for finding out the problem, how about the following way of
checking
for prefer-read/prefer-standby.
1. (default_transaction_read_only = true and recovery = true)
2. If none of the host satisfies the above scenario, then recovery = true
3. Last check is for default_transaction_read_only = true
That would be fine. But as I mentioned in another mail, I think "get
read-only session" and "connect to standby" differ. So I find it better to
separate parameters for those request; target_session_attr and
target_server_type.
Thanks for your opinion.
Does target_session_attrs check default_transaction_readonly?
Does target_server_type check whether the server is in recovery?
I feel that having two options makes this feature complex to use from the
user's point of view.
The need for two options arose because of the possibility of a master server
with default_transaction_readonly set to true. Even if the default
transaction is read-only, it is a user-changeable parameter, so there
shouldn't be any problem.
The same applies to logical replication: the user can change the default
transaction mode once the connection is established, if it does not suit
their requirement.
How about just adding one parameter that takes options similar to JDBC's?
target_server_type - Master, standby and prefer-standby. (The option names
can be revised based on the common words in the PostgreSQL docs.)
And one more thing: what happens when the server is promoted to master but
the connection requested was for a standby? I feel we can maintain the
existing connections and redirect only later new connections. Comments?
Regards,
Haribabu Kommi
Fujitsu Australia
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
target_session_attrs checks for the default_transaction_readonly or not?
PG 11 uses transaction_read_only, not default_transaction_readonly. That's fine, because its purpose is to get a read-only session as the name suggests, not to connect to a standby.
target_server_type checks for whether the server is in recovery or not?
Yes.
I feel having two options make this feature complex to use it from the user
point of view?
The need of two options came because of a possibility of a master server
with default_transaction_readonly set to true. Even if the default
transaction is readonly, it is user changeable parameter, so there
shouldn't be any problem.
No. It's not good if the user has to be bothered by default_transaction_read_only when he simply wants to connect to a standby.
how about just adding one parameter that takes the options similar like
JDBC?
target_server_type - Master, standby and prefer-standby. (The option names
can revised based on the common words on the postgresql docs?)
"Getting a read-only session" is not equal to "connecting to a standby", so two different parameters make sense.
And one more thing, what happens when the server promotes to master but
the connection requested is standby? I feel we can maintain the existing
connections
and later new connections can be redirected? comments?
Ideally, it should be possible for the user to choose the behavior like Oracle below. But that's a separate feature.
9.2 Role Transitions Involving Physical Standby Databases
https://docs.oracle.com/en/database/oracle/oracle-database/18/sbydb/managing-oracle-data-guard-role-transitions.html#GUID-857F6F45-DC1C-4345-BD39-F3BE7D79F1CD
--------------------------------------------------
Keeping Physical Standby Sessions Connected During Role Transition
As of Oracle Database 12c Release 2 (12.2.0.1), when a physical standby database is converted into a primary you have the option to keep any sessions connected to the physical standby connected, without disruption, during the switchover/failover.
To enable this feature, set the STANDBY_DB_PRESERVE_STATES initialization parameter in your init.ora file before the standby instance is started. This parameter applies to physical standby databases only. The allowed values are:
NONE — No sessions on the standby are retained during a switchover/failover. This is the default value.
ALL — User sessions are retained during switchover/failover.
SESSION — User sessions are retained during switchover/failover.
--------------------------------------------------
Would you like to work on this patch? I'm not sure if I can take time, but I'm willing to do it if you don't have enough time.
As Tom mentioned, we need to integrate and clean patches in three mail threads:
* Make a new GUC_REPORT parameter, server_type, to show the server role (primary or standby).
* Add target_server_type libpq connection parameter, whose values are either primary, standby, or prefer_standby.
* Failover timeout, load balancing, etc. that someone proposed in the other thread?
(I wonder which of server_type or server_role feels natural in English.)
Or, would you like to share the work, e.g., libpq and server-side?
Regards
Takayuki Tsunakawa
On Fri, Feb 8, 2019 at 8:16 PM Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
target_session_attrs checks for the default_transaction_readonly or not?
PG 11 uses transaction_read_only, not default_transaction_readonly.
That's fine, because its purpose is to get a read-only session as the name
suggests, not to connect to a standby.
Thanks for the correction; yes, it uses transaction_read_only.
target_server_type checks for whether the server is in recovery or not?
Yes.
I feel having two options make this feature complex to use it from the
user point of view?
The need of two options came because of a possibility of a master server
with default_transaction_readonly set to true. Even if the default
transaction is readonly, it is user changeable parameter, so there
shouldn't be any problem.
No. It's not good if the user has to be bothered by
default_transaction_read_only when he simply wants to connect to a standby.
OK. Understood.
So if we are going to differentiate between read-only and standby types,
then I still feel that adding prefer-read to target_session_attrs is a
valid improvement. But that improvement can be made once the base work of
GUC_REPORT is finished.
how about just adding one parameter that takes the options similar like
JDBC?
target_server_type - Master, standby and prefer-standby. (The option names
can revised based on the common words on the postgresql docs?)
"Getting a read-only session" is not equal to "connecting to a standby",
so two different parameters make sense.
And one more thing, what happens when the server promotes to master but
the connection requested is standby? I feel we can maintain the existing
connections and later new connections can be redirected? comments?
Ideally, it should be possible for the user to choose the behavior like
Oracle below. But that's a separate feature.
9.2 Role Transitions Involving Physical Standby Databases
https://docs.oracle.com/en/database/oracle/oracle-database/18/sbydb/managing-oracle-data-guard-role-transitions.html#GUID-857F6F45-DC1C-4345-BD39-F3BE7D79F1CD
--------------------------------------------------
Keeping Physical Standby Sessions Connected During Role Transition
As of Oracle Database 12c Release 2 (12.2.0.1), when a physical standby
database is converted into a primary you have the option to keep any
sessions connected to the physical standby connected, without disruption,
during the switchover/failover.
To enable this feature, set the STANDBY_DB_PRESERVE_STATES initialization
parameter in your init.ora file before the standby instance is started.
This parameter applies to physical standby databases only. The allowed
values are:
NONE - No sessions on the standby are retained during a
switchover/failover. This is the default value.
ALL - User sessions are retained during switchover/failover.
SESSION - User sessions are retained during switchover/failover.
--------------------------------------------------
Yes, the above is a completely different role-transition enhancement
that can be taken up separately.
Would you like to work on this patch? I'm not sure if I can take time,
but I'm willing to do it if you don't have enough time.
As Tom mentioned, we need to integrate and clean patches in three mail
threads:
* Make a new GUC_REPORT parameter, server_type, to show the server role
(primary or standby).
* Add target_server_type libpq connection parameter, whose values are
either primary, standby, or prefer_standby.
* Failover timeout, load balancing, etc. that someone proposed in the
other thread?
Yes, I want to work on this patch, hopefully by the next commitfest. If I
don't get time, I will ask for your help.
(I wonder which of server_type or server_role feels natural in English.)
server_type may be good, as it goes with the connection option
(target_server_type).
Regards,
Haribabu Kommi
Fujitsu Australia
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
No. It's not good if the user has to be bothered by
default_transaction_read_only when he simply wants to a standby.
OK. Understood.
so if we are going to differentiate between readonly and standby types,
then I still
feel that adding a prefer-read to target_session_attrs is still valid
improvement.
I agree that it's a valid improvement to add prefer-read to target_session_attrs, as a means to "get a read-only session."
But the above improvement can be enhanced once the base work of GUC_REPORT
is finished.
Is it already in progress in some thread, or are you trying to start from scratch? (I may have done it, but I don't remember it well...)
Yes, I want to work on this patch, hopefully by next commitfest. In case
if I didn't get time,
I can ask for your help.
I'm glad to hear that. Sure. I'd like to review your patch, and possibly add/modify code if necessary. Are you going to add target_server_type={primary | standby | prefer_standby} as well as add prefer-read to target_session_attrs?
(I wonder which of server_type or server_role feels natural in
English.)
server_type may be good as it stands with connection option
(target_server_type).
Thanks, agreed. That also follows PgJDBC's targetServerType.
Regards
Takayuki Tsunakawa
On Thu, Feb 14, 2019 at 1:04 PM Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
No. It's not good if the user has to be bothered by
default_transaction_read_only when he simply wants to a standby.
OK. Understood.
so if we are going to differentiate between readonly and standby types,
then I still feel that adding a prefer-read to target_session_attrs is
still valid improvement.
I agree that it's valid improvement to add prefer-read to
target_session_attr, as a means to "get a read-only session."
But the above improvement can be enhanced once the base work of
GUC_REPORT is finished.
Is it already in progress in some thread, or are you trying to start from
scratch? (I may have done it, but I don't remember it well...)
Yes, I want to work on this patch, hopefully by next commitfest. In case
if I didn't get time, I can ask for your help.
I'm glad to hear that. Sure. I'd like to review your patch, and possibly
add/modify code if necessary. Are you going to add
target_server_type={primary | standby | prefer_standby} as well as add
prefer-read to target_session_attr?
(I wonder which of server_type or server_role feels natural in
English.)
server_type may be good as it stands with connection option
(target_server_type).
Thanks, agreed. That also follows PgJDBC's targetServerType.
Here I have attached the first set of patches, which implement the
prefer-read option after reporting the transaction_read_only GUC to the
client. Along with adding the prefer-read option, I made these changes:
1. Refactored the existing code to reduce duplication.
2. Added an enum to represent the user-requested target_session_attrs type;
it is used in comparisons instead of doing a strcmp every time.
3. Modified the existing read-write code to use the newly reported GUC
instead of executing the SHOW command.
The basic patches are working; some documentation work may still be needed.
Next I will add another parameter, target_server_type, to choose primary,
standby or prefer-standby, as discussed upthread, along with a new GUC
variable.
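As a rough illustration of point 2, the option string can be parsed once into an enum so that later checks are plain comparisons. SESSION_TYPE_ANY and SESSION_TYPE_READ_WRITE appear in the attached 0002 patch; SESSION_TYPE_PREFER_READ and the parse_target_session_attrs helper are assumed names for this sketch and may not match the actual patches.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum
{
    SESSION_TYPE_ANY,
    SESSION_TYPE_READ_WRITE,
    SESSION_TYPE_PREFER_READ    /* assumed name for the new value */
} TargetSessionAttrsType;

/*
 * Parses the target_session_attrs string once. Returns false for an
 * unrecognized value, in which case *result is left untouched and the
 * caller would report an invalid connection option.
 */
bool
parse_target_session_attrs(const char *value, TargetSessionAttrsType *result)
{
    if (value == NULL || strcmp(value, "any") == 0)
        *result = SESSION_TYPE_ANY;     /* "any" is the default */
    else if (strcmp(value, "read-write") == 0)
        *result = SESSION_TYPE_READ_WRITE;
    else if (strcmp(value, "prefer-read") == 0)
        *result = SESSION_TYPE_PREFER_READ;
    else
        return false;
    return true;
}
```

After parsing, a check such as "requested type == SESSION_TYPE_READ_WRITE" replaces the repeated strcmp calls on the connection path.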
Regards,
Haribabu Kommi
Fujitsu Australia
Attachments:
0002-New-TargetSessionAttrsType-enum.patch
From 9264fc5bb2188043bcd19d0e549bf1e55f685d38 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Fri, 22 Feb 2019 00:09:02 +1100
Subject: [PATCH 2/4] New TargetSessionAttrsType enum
This new enum is useful to compare the requested session type
instead of comparing it with string always. This may not show
much improvement with current code, but it will be useful with
further patches
---
src/interfaces/libpq/fe-connect.c | 10 ++++++----
src/interfaces/libpq/libpq-fe.h | 6 ++++++
src/interfaces/libpq/libpq-int.h | 1 +
3 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index fa157bd2dc..e03d78d3f8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1239,8 +1239,11 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ else if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -3229,8 +3232,7 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE)
{
/*
* Save existing error messages across the PQsendQuery
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 1961161b35..15bb82a885 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -71,6 +71,12 @@ typedef enum
* them. */
} ConnStatusType;
+typedef enum
+{
+ SESSION_TYPE_ANY = 0, /* Any session (default) */
+ SESSION_TYPE_READ_WRITE /* Read-write session */
+} TargetSessionAttrsType;
+
typedef enum
{
PGRES_POLLING_FAILED = 0,
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4a93d8edbc..43aa6b5f30 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,6 +365,7 @@ struct pg_conn
/* Type of connection to make. Possible values: any, read-write. */
char *target_session_attrs;
+ TargetSessionAttrsType requested_session_type;
/* Optional file to write trace info to */
FILE *Pfdebug;
--
2.20.1.windows.1
0004-New-prefer-read-target_session_attrs-type.patch
From 8444117409b1d4326f3f8014aa29069a4493fd1f Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Fri, 22 Feb 2019 00:22:43 +1100
Subject: [PATCH 4/4] New prefer-read target_session_attrs type
With this prefer-read option type, application can prefer
connecting to a read-only server if available from the list
of hosts, otherwise connect it to read-write server
---
doc/src/sgml/libpq.sgml | 15 ++-
src/interfaces/libpq/fe-connect.c | 126 ++++++++++++++++++++++----
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 10 +-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
5 files changed, 143 insertions(+), 25 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 5c29beef51..d4bddef6cf 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1585,8 +1585,19 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
returns <literal>on</literal>, the connection will be closed.
If multiple hosts were specified in the connection string, any
remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
+ attempt had failed.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, connections
+ where <literal>SHOW transaction_read_only</literal> returns <literal>on</literal>
+ are preferred. If no such connections can be found, then a connection
+ that allows read-write transactions will be accepted.
+ </para>
+
+ <para>
+ The default value of this parameter is <literal>any</literal>,
+ regards all connections as acceptable.
</para>
</listitem>
</varlistentry>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 25a153f48c..6943674067 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -322,7 +322,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1243,6 +1243,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_ANY;
else if (strcmp(conn->target_session_attrs, "read-write") == 0)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else
{
conn->status = CONNECTION_BAD;
@@ -2137,13 +2139,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means, failed to connect to read-only servers
+ * and now try connect to read-write server again.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is already
+ * set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -2347,6 +2367,7 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
+
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("could not create socket: %s\n"),
SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
@@ -3225,14 +3246,16 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required,
+ * see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
if (conn->sversion < 120000)
{
@@ -3253,15 +3276,23 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
+
conn->status = CONNECTION_CHECK_WRITABLE;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
- else if (conn->transaction_read_only)
+ else if ((conn->transaction_read_only
+ && (conn->requested_session_type == SESSION_TYPE_READ_WRITE)) ||
+ (!conn->transaction_read_only
+ && (conn->requested_session_type == SESSION_TYPE_PREFER_READ)
+ && (conn->read_write_host_index != -2)))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
const char *displayed_host;
const char *displayed_port;
+ const char *type = (conn->requested_session_type == SESSION_TYPE_PREFER_READ) ?
+ "read-only" : "writable";
/* Append error report to conn->errorMessage. */
if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
@@ -3273,15 +3304,19 @@ keep_going: /* We will come back to here until there is
displayed_port = DEF_PGPORT_STR;
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
+ libpq_gettext("could not make a %s "
"connection to server "
"\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ type, displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3289,6 +3324,39 @@ keep_going: /* We will come back to here until there is
conn->try_next_host = true;
goto keep_going;
}
+ else /* obtained the requested type, consume it */
+ {
+ /* We can release the address list now. */
+ release_conn_addrinfo(conn);
+
+ /* We are open for business! */
+ conn->status = CONNECTION_OK;
+ return PGRES_POLLING_OK;
+ }
+ }
+
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if ((conn->target_session_attrs != NULL) &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ) &&
+ (conn->read_write_host_index != -2))
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
/* We can release the address list now. */
@@ -3358,11 +3426,22 @@ keep_going: /* We will come back to here until there is
char *val;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+
+ /*
+ * Server is read-only and requested mode is read-write, ignore it.
+ * Server is read-write and requested mode is prefer-read, record
+ * it for the first time and try to consume in the next scan (it means
+ * no read-only server is found in the first scan).
+ */
+ if (((strncmp(val, "on", 2) == 0) &&
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE)) ||
+ ((strncmp(val, "off", 3) == 0) &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ) &&
+ (conn->read_write_host_index != -2)))
{
- /* Not writable; fail this connection. */
- const char *displayed_host;
- const char *displayed_port;
+ /* Not a requested type; fail this connection. */
+ const char *type = (conn->requested_session_type == SESSION_TYPE_PREFER_READ) ?
+ "read-only" : "writable";
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3377,15 +3456,19 @@ keep_going: /* We will come back to here until there is
displayed_port = DEF_PGPORT_STR;
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
+ libpq_gettext("could not make a %s "
"connection to server "
"\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ type, displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3394,7 +3477,7 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3608,6 +3691,9 @@ makeEmptyPGconn(void)
conn = NULL;
}
+ /* Initial value */
+ conn->read_write_host_index = -1;
+
return conn;
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 15bb82a885..a7de100c3b 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -74,7 +74,8 @@ typedef enum
typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
- SESSION_TYPE_READ_WRITE /* Read-write session */
+ SESSION_TYPE_READ_WRITE, /* Read-write session */
+ SESSION_TYPE_PREFER_READ /* Prefer read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index b0ac98b90d..1878598633 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -363,7 +363,7 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /* Type of connection to make. Possible values: any, read-write, perfer-read. */
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -400,6 +400,14 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * First read-write host index in the connection string.
+ *
+ * Initial value is -1, then the index of the first read-write
+ * host, -2 during the second attempt of connection to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index beb45551a2..0e398136a5 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 26;
+use Test::More tests => 29;
# Initialize master node
my $node_master = get_new_node('master');
@@ -117,6 +117,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
0001-Restructure-the-code-to-remove-duplicate-code.patch
From a0d0f95fa4115df3a302a44285234f96067d2dd6 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 21 Feb 2019 23:11:55 +1100
Subject: [PATCH 1/4] Restructure the code to remove duplicate code
The duplicate code logic of checking for the server version
before issuing the transaction_readonly to find out whether
the server is read-write or not is restuctured under a new
connection status, so that duplicate code is removed. This is
required for the next set of patches
---
src/interfaces/libpq/fe-connect.c | 99 ++++++++++++-------------------
src/interfaces/libpq/libpq-fe.h | 2 +
2 files changed, 39 insertions(+), 62 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f29202db5f..fa157bd2dc 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3184,6 +3184,43 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_WRITING;
}
+ conn->status = CONNECTION_CHECK_TARGET;
+ goto keep_going;
+ }
+
+ case CONNECTION_SETENV:
+ {
+
+ /*
+ * Do post-connection housekeeping (only needed in protocol 2.0).
+ *
+ * We pretend that the connection is OK for the duration of these
+ * queries.
+ */
+ conn->status = CONNECTION_OK;
+
+ switch (pqSetenvPoll(conn))
+ {
+ case PGRES_POLLING_OK: /* Success */
+ break;
+
+ case PGRES_POLLING_READING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_READING;
+
+ case PGRES_POLLING_WRITING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_WRITING;
+
+ default:
+ goto error_return;
+ }
+ }
+
+ /* Intentional fall through */
+
+ case CONNECTION_CHECK_TARGET:
+ {
/*
* If a read-write connection is required, see if we have one.
*
@@ -3225,68 +3262,6 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_OK;
}
- case CONNECTION_SETENV:
-
- /*
- * Do post-connection housekeeping (only needed in protocol 2.0).
- *
- * We pretend that the connection is OK for the duration of these
- * queries.
- */
- conn->status = CONNECTION_OK;
-
- switch (pqSetenvPoll(conn))
- {
- case PGRES_POLLING_OK: /* Success */
- break;
-
- case PGRES_POLLING_READING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_READING;
-
- case PGRES_POLLING_WRITING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_WRITING;
-
- default:
- goto error_return;
- }
-
- /*
- * If a read-write connection is required, see if we have one.
- * (This should match the stanza in the CONNECTION_AUTH_OK case
- * above.)
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but by
- * the same token they don't have any read-only mode, so we may
- * just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
- {
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
- {
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
- }
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
- }
-
- /* We can release the address list now. */
- release_conn_addrinfo(conn);
-
- /* We are open for business! */
- conn->status = CONNECTION_OK;
- return PGRES_POLLING_OK;
-
case CONNECTION_CONSUME:
{
conn->status = CONNECTION_OK;
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 97bc98b1f3..1961161b35 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -65,6 +65,8 @@ typedef enum
CONNECTION_NEEDED, /* Internal state: connect() needed */
CONNECTION_CHECK_WRITABLE, /* Check if we could make a writable
* connection. */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target
+ * connection */
CONNECTION_CONSUME /* Wait for any pending message and consume
* them. */
} ConnStatusType;
--
2.20.1.windows.1
0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch
From 4b2a5db17bfca262d5c7201dab248769a002d4e1 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Fri, 22 Feb 2019 00:12:37 +1100
Subject: [PATCH 3/4] Make transaction_read_only as GUC_REPORT varaible
transaction_read_only GUC variable value is used in multi host
connection to identify the required host of read-write, but currently
this carried out by executing a command to find out whether the host
is a read-write or not? Instead of that, Reporting the GUC to the client
upon connection reduces the time to make the connection.
---
doc/src/sgml/libpq.sgml | 15 ++++---
doc/src/sgml/protocol.sgml | 8 ++--
src/backend/utils/misc/guc.c | 2 +-
src/interfaces/libpq/fe-connect.c | 70 +++++++++++++++++++++++--------
src/interfaces/libpq/fe-exec.c | 6 ++-
src/interfaces/libpq/libpq-int.h | 1 +
6 files changed, 74 insertions(+), 28 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c1d1b6b2db..5c29beef51 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1581,9 +1581,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
connection in which read-write transactions are accepted by default
is considered acceptable. The query
<literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
+ successful connection if the server is prior to version 12; if it
+ returns <literal>on</literal>, the connection will be closed.
+ If multiple hosts were specified in the connection string, any
+ remaining servers will be tried just as if the connection
attempt had failed. The default value of this parameter,
<literal>any</literal>, regards all connections as acceptable.
</para>
@@ -1951,14 +1952,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by release before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index d66b860cbd..6b22dfe47a 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELCT 1/0;
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 156d147c85..cf62cf8699 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1506,7 +1506,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index e03d78d3f8..25a153f48c 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3234,26 +3234,61 @@ keep_going: /* We will come back to here until there is
if (conn->sversion >= 70400 &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE)
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
+ }
+ else if (conn->transaction_read_only)
+ {
+ /* Not writable; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
}
/* We can release the address list now. */
@@ -3537,6 +3572,7 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index ac969e7b3f..5adfc36171 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1033,7 +1033,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1087,6 +1087,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp (name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 43aa6b5f30..b0ac98b90d 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -428,6 +428,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* session_read_only */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.20.1.windows.1
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
Here I attached first set of patches that implemented the prefer-read option
after reporting the transaction_read_only GUC to client. Along the lines
of adding prefer-read option patch,
Great, thank you! I'll review and test it.
3. Existing read-write code is modified to use the new reported GUC instead
of executing the show command.
Is the code path left to use SHOW for older servers?
Regards
Takayuki Tsunakawa
On Fri, Feb 22, 2019 at 5:47 PM Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
Here I attached first set of patches that implemented the prefer-read option
after reporting the transaction_read_only GUC to client. Along the lines
of adding prefer-read option patch,
Great, thank you! I'll review and test it.
Thanks.
3. Existing read-write code is modified to use the new reported GUC
instead
of executing the show command.
Is the code path left to use SHOW for older servers?
Yes, for servers older than version 12, it still uses the SHOW command approach.
Regards,
Haribabu Kommi
Fujitsu Australia
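The version gating discussed above can be sketched as a small decision function. This is only an illustration under stated assumptions: SketchCheckMethod and sketch_choose_check are hypothetical names, and the thresholds 70400 and 120000 are taken from the conditions shown in the attached patches.

```c
#include <stdbool.h>

/* Hypothetical: how the client learns whether the session is read-only. */
typedef enum
{
	SKETCH_SKIP_CHECK,		/* no check requested, or server has no read-only mode */
	SKETCH_SEND_SHOW,		/* older server: send SHOW transaction_read_only */
	SKETCH_USE_REPORTED		/* newer server: use the reported GUC value */
} SketchCheckMethod;

/*
 * sversion uses the numeric server version form (for example 110005 or
 * 120000).  check_needed is true when the requested session type requires
 * knowing the read-only status (read-write or prefer-read in the patches).
 */
static SketchCheckMethod
sketch_choose_check(int sversion, bool check_needed)
{
	if (!check_needed || sversion < 70400)
		return SKETCH_SKIP_CHECK;
	if (sversion < 120000)
		return SKETCH_SEND_SHOW;
	return SKETCH_USE_REPORTED;
}
```

The point of the split is that only servers from version 12 on would report transaction_read_only at connection time, so the extra round trip remains necessary for everything older.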
Hi Hari-san,
I've reviewed all files. I think I'll proceed to testing when I've reviewed the revised patch and the patch for target_server_type.
(1) patch 0001
CONNECTION_CHECK_WRITABLE, /* Check if we could make a writable
* connection. */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target
+ * connection */
CONNECTION_CONSUME /* Wait for any pending message and consume
* them. */
According to the following comment, a new enum value should be added at the end.
/*
* Although it is okay to add to these lists, values which become unused
* should never be removed, nor should constants be redefined - that would
* break compatibility with existing code.
*/
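To illustrate why the position matters, here is a small sketch with hypothetical, shortened enums. Inserting a value in the middle renumbers every later constant, while appending at the end leaves existing values unchanged, so code compiled against the old header keeps working.

```c
/* Hypothetical, shortened versions of a status enum. */
typedef enum
{
	OLD_CHECK_WRITABLE,
	OLD_CONSUME
} SketchOldStatus;

/* New value inserted in the middle: OLD_CONSUME's number is reused. */
typedef enum
{
	MID_CHECK_WRITABLE,
	MID_CHECK_TARGET,
	MID_CONSUME
} SketchInsertedInMiddle;

/* New value appended at the end: existing numbers stay the same. */
typedef enum
{
	END_CHECK_WRITABLE,
	END_CONSUME,
	END_CHECK_TARGET
} SketchAppendedAtEnd;
```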
(2) patch 0002
It seems to align better with the existing code to set the default value to pg_conn.requested_session_type explicitly in makeEmptyPGconn(), even if the default value is 0. Although I feel it's redundant, other member variables do so.
(3) patch 0003
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
";" is missing at the end of the third line.
(4) patch 0004
- /* Type of connection to make. Possible values: any, read-write. */
+ /* Type of connection to make. Possible values: any, read-write, perfer-read. */
char *target_session_attrs;
perfer -> prefer
(5) patch 0004
@@ -3608,6 +3691,9 @@ makeEmptyPGconn(void)
conn = NULL;
}
+ /* Initial value */
+ conn->read_write_host_index = -1;
The new member should be initialized earlier in this function. Otherwise, as you can see in the above fragment, conn can be NULL in an out-of-memory case.
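The ordering problem can be sketched with a simplified constructor. SketchConn and sketch_make_conn are hypothetical and much smaller than the real makeEmptyPGconn(); the simulate_oom flag exists only to exercise the failure path in this example.

```c
#include <stdlib.h>

/* Hypothetical, heavily reduced connection object. */
typedef struct
{
	int		read_write_host_index;
	char   *buffer;
} SketchConn;

static SketchConn *
sketch_make_conn(int simulate_oom)
{
	SketchConn *conn = malloc(sizeof(SketchConn));

	if (conn == NULL)
		return NULL;

	/* Initialize plain members here, while conn is known to be non-NULL. */
	conn->read_write_host_index = -1;

	/* A later allocation may fail; simulate_oom forces that path. */
	conn->buffer = simulate_oom ? NULL : malloc(16);
	if (conn->buffer == NULL)
	{
		free(conn);
		conn = NULL;
	}

	/* Writing to conn->... at this point would crash when conn is NULL. */
	return conn;
}
```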
(6) patch 0004
Shouldn't we add read-only as well as prefer-read, corresponding to Slave or Secondary of PgJDBC's targetServerType? I thought the original proposal was to add both.
(7) patch 0004
@@ -2347,6 +2367,7 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
+
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("could not create socket: %s\n"),
Is this an unintended newline addition?
(8) patch 0004
+ const char *type = (conn->requested_session_type == SESSION_TYPE_PREFER_READ) ?
+ "read-only" : "writable";
I'm afraid these strings are not translatable into other languages.
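One possible way to address this, sketched under assumptions, is to keep two complete message strings so that each can be translated as a whole sentence. The macro sketch_gettext below is a stand-in for libpq_gettext, and sketch_format_error is a hypothetical helper, not code from the patch.

```c
#include <stdio.h>

/* Stand-in for libpq_gettext(); a real build would look up a translation. */
#define sketch_gettext(s) (s)

/*
 * Select a whole translatable sentence instead of splicing an untranslated
 * word ("read-only" or "writable") into a shared format string.
 */
static int
sketch_format_error(char *buf, size_t len, int prefer_read,
					const char *host, const char *port)
{
	const char *fmt = prefer_read
		? sketch_gettext("could not make a read-only connection to server \"%s:%s\"\n")
		: sketch_gettext("could not make a writable connection to server \"%s:%s\"\n");

	return snprintf(buf, len, fmt, host, port);
}
```

With this shape, translators see two full sentences and can reorder words freely, which a "%s" placeholder for the adjective does not allow.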
(9) patch 0004
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
At this point, the session can be either read-write or read-only, can't it? Add the check "!conn->transaction_read_only" in this if condition?
(10) patch 0004
+ if ((conn->target_session_attrs != NULL) &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ) &&
+ (conn->read_write_host_index != -2))
The first condition is not necessary because the second one exists.
The parentheses surrounding each of these conditions are redundant. It would be better to remove them for readability. This also applies to the following part:
+ if (((strncmp(val, "on", 2) == 0) &&
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE)) ||
+ ((strncmp(val, "off", 3) == 0) &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ) &&
+ (conn->read_write_host_index != -2)))
Regards
Takayuki Tsunakawa
On Mon, Feb 25, 2019 at 11:38 AM Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
Hi Hari-san,
I've reviewed all files. I think I'll proceed to testing when I've
reviewed the revised patch and the patch for target_server_type.
Thanks for the review.
(1) patch 0001
CONNECTION_CHECK_WRITABLE, /* Check if we could make a writable
* connection. */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target
+ * connection */
CONNECTION_CONSUME /* Wait for any pending message and consume
* them. */
According to the following comment, a new enum value should be added at the end.
/*
* Although it is okay to add to these lists, values which become unused
* should never be removed, nor should constants be redefined - that would
* break compatibility with existing code.
*/
fixed.
(2) patch 0002
It seems to align better with the existing code to set the default value
to pg_conn.requested_session_type explicitly in makeEmptyPGconn(), even if
the default value is 0. Although I feel it's redundant, other member
variables do so.
fixed.
(3) patch 0003
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
";" is missing at the end of the third line.
fixed.
(4) patch 0004
- /* Type of connection to make. Possible values: any, read-write. */
+ /* Type of connection to make. Possible values: any, read-write, perfer-read. */
char *target_session_attrs;
perfer -> prefer
fixed.
(5) patch 0004
@@ -3608,6 +3691,9 @@ makeEmptyPGconn(void)
conn = NULL;
}
+ /* Initial value */
+ conn->read_write_host_index = -1;
The new member should be initialized earlier in this function. Otherwise, as you can see in the above fragment, conn can be NULL in an out-of-memory case.
fixed.
(6) patch 0004
Don't we add read-only as well as prefer-read, which corresponds to Slave
or Secondary of PgJDBC's targetServerType? I thought the original proposal
was to add both.
Added read-only option.
(7) patch 0004
@@ -2347,6 +2367,7 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
+
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("could not create socket: %s\n"),
Is this an unintended newline addition?
removed.
(8) patch 0004
+ const char *type = (conn->requested_session_type == SESSION_TYPE_PREFER_READ) ?
+ "read-only" : "writable";
I'm afraid these strings are not translatable into other languages.
OK. I now build two separate error messages, one for read-write and one
for read-only, so translation shouldn't be a problem.
(9) patch 0004
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
At this point, the session can be either read-write or read-only, can't it? Add the check "!conn->transaction_read_only" in this if condition?
Yes, fixed.
(10) patch 0004
+ if ((conn->target_session_attrs != NULL) &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ) &&
+ (conn->read_write_host_index != -2))
The first condition is not necessary because the second one exists.
The parentheses surrounding each of these conditions are redundant. It would be better to remove them for readability. This also applies to the following part:
+ if (((strncmp(val, "on", 2) == 0) &&
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE)) ||
+ ((strncmp(val, "off", 3) == 0) &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ) &&
+ (conn->read_write_host_index != -2)))
fixed.
Attached are the updated patches.
The target_server_type option is yet to be implemented.
Regards,
Haribabu Kommi
Fujitsu Australia
Attachments:
0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch (application/octet-stream)
From 60a2b45e081e0916d4a3199e1d93eed2a6a5d71c Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 16:12:24 +1100
Subject: [PATCH 3/5] Make transaction_read_only a GUC_REPORT variable
The transaction_read_only GUC value is used in multi-host connections
to identify a read-write host, but currently this is done by executing
a command to find out whether the host is read-write. Reporting the GUC
to the client upon connection instead reduces the time needed to make
the connection.
---
doc/src/sgml/libpq.sgml | 14 ++++---
doc/src/sgml/protocol.sgml | 8 ++--
src/backend/utils/misc/guc.c | 2 +-
src/interfaces/libpq/fe-connect.c | 70 +++++++++++++++++++++++--------
src/interfaces/libpq/fe-exec.c | 6 ++-
src/interfaces/libpq/libpq-int.h | 1 +
6 files changed, 74 insertions(+), 27 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index ac78524924..7b9d219bde 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1594,8 +1594,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, means server
- doesn't support read-write transactions.
+ successful connection if the server is prior to version 12; if it returns
+ <literal>on</literal>, means server doesn't support read-write transactions.
+ But for servers version 12 or greater uses the <varname>transaction_read_only</varname>
+ GUC that is reported by the server upon successful connection.
</para>
</listitem>
</varlistentry>
@@ -1961,14 +1963,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by release before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index d66b860cbd..f3357b5c59 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELCT 1/0;
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 156d147c85..cf62cf8699 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1506,7 +1506,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 9932dccec5..718a981ed6 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3234,26 +3234,61 @@ keep_going: /* We will come back to here until there is
if (conn->sversion >= 70400 &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE)
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
+ }
+ else if (conn->transaction_read_only)
+ {
+ /* Not writable; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
}
/* We can release the address list now. */
@@ -3534,6 +3569,7 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index ac969e7b3f..99dd8b1c9c 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1033,7 +1033,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1087,6 +1087,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 43aa6b5f30..b0ac98b90d 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -428,6 +428,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* session_read_only */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.20.1.windows.1
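For readers following the fe-connect.c and fe-exec.c hunks of patch 0003, the version gate and the handling of the reported value can be sketched in isolation as below. This is only an illustration under stated assumptions, not code from the patch: `ReadOnlyCheckMethod`, `read_only_check_method()` and `reported_value_is_read_only()` are hypothetical names, while the thresholds 70400 and 120000 and the comparison against "on" are taken from the diff. In the patch itself this logic only runs when a read-write session was requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical type: how the client learns whether the server is read-only. */
typedef enum
{
    CHECK_NOT_NEEDED,       /* pre-7.4 servers have no read-only mode */
    CHECK_BY_SHOW_QUERY,    /* send "SHOW transaction_read_only" */
    CHECK_BY_REPORTED_GUC   /* use the value reported at connection start */
} ReadOnlyCheckMethod;

/*
 * Hypothetical helper mirroring the version tests in the patch:
 * sversion is the numeric server version as stored in conn->sversion.
 */
static ReadOnlyCheckMethod
read_only_check_method(int sversion)
{
    if (sversion < 70400)
        return CHECK_NOT_NEEDED;
    if (sversion < 120000)
        return CHECK_BY_SHOW_QUERY;
    return CHECK_BY_REPORTED_GUC;
}

/*
 * Hypothetical helper mirroring the new branch in pqSaveParameterStatus():
 * only the exact string "on" is treated as read-only.
 */
static bool
reported_value_is_read_only(const char *value)
{
    assert(value != NULL);
    return strcmp(value, "on") == 0;
}
```

With these assumptions, a version 11 server still costs one extra round trip for the SHOW query, while a version 12 server is classified from the startup ParameterStatus messages alone.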
0002-New-TargetSessionAttrsType-enum.patch (application/octet-stream)
From fc8596034bb34727eb8198988d69c0a17e795c3c Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 11:50:33 +1100
Subject: [PATCH 2/5] New TargetSessionAttrsType enum
This new enum lets the code compare the requested session type
directly instead of always comparing strings. It may not show
much improvement with the current code, but it will be useful in
later patches.
---
src/interfaces/libpq/fe-connect.c | 12 ++++++++----
src/interfaces/libpq/libpq-fe.h | 6 ++++++
src/interfaces/libpq/libpq-int.h | 1 +
3 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 09e010de7e..9932dccec5 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1239,8 +1239,11 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ else if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -3229,8 +3232,7 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE)
{
/*
* Save existing error messages across the PQsendQuery
@@ -3536,6 +3538,8 @@ makeEmptyPGconn(void)
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
+ conn->requested_session_type = SESSION_TYPE_ANY;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 50cfe266c6..6961d7ce8e 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -70,6 +70,12 @@ typedef enum
CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
+typedef enum
+{
+ SESSION_TYPE_ANY = 0, /* Any session (default) */
+ SESSION_TYPE_READ_WRITE /* Read-write session */
+} TargetSessionAttrsType;
+
typedef enum
{
PGRES_POLLING_FAILED = 0,
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4a93d8edbc..43aa6b5f30 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,6 +365,7 @@ struct pg_conn
/* Type of connection to make. Possible values: any, read-write. */
char *target_session_attrs;
+ TargetSessionAttrsType requested_session_type;
/* Optional file to write trace info to */
FILE *Pfdebug;
--
2.20.1.windows.1
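The string-to-enum mapping that patch 0002 adds to connectOptions2() can be sketched on its own as follows. This is a simplified illustration, not the patch code: `parse_target_session_attrs()` is a hypothetical helper, and treating a NULL string as "any" is an assumption based on the enum member being initialized to SESSION_TYPE_ANY in makeEmptyPGconn(). The enum and its two values are copied from the diff; later patches in the series extend both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the enum added to libpq-fe.h by patch 0002. */
typedef enum
{
    SESSION_TYPE_ANY = 0,       /* Any session (default) */
    SESSION_TYPE_READ_WRITE     /* Read-write session */
} TargetSessionAttrsType;

/*
 * Hypothetical helper: translate the target_session_attrs string once,
 * so later code can compare enum values instead of calling strcmp().
 * Returns false for an unrecognized value, which the patch reports as
 * CONNECTION_BAD with an error message.
 */
static bool
parse_target_session_attrs(const char *value, TargetSessionAttrsType *result)
{
    assert(result != NULL);

    if (value == NULL || strcmp(value, "any") == 0)
        *result = SESSION_TYPE_ANY;     /* assumption: unset means "any" */
    else if (strcmp(value, "read-write") == 0)
        *result = SESSION_TYPE_READ_WRITE;
    else
        return false;

    return true;
}
```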
0005-New-read-only-target_session_attrs-type.patch (application/octet-stream)
From 463e566cfaa8d40a0d674fb4caab1b2cad05cad3 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 16:29:29 +1100
Subject: [PATCH 5/5] New read-only target_session_attrs type
With this read-only option type, an application can connect only to
a read-only server in the list of hosts; if no read-only server is
available, the connection attempt fails.
---
doc/src/sgml/libpq.sgml | 2 +-
src/interfaces/libpq/fe-connect.c | 28 ++++++++++++++++++---------
src/interfaces/libpq/libpq-fe.h | 3 ++-
src/interfaces/libpq/libpq-int.h | 2 +-
src/test/recovery/t/001_stream_rep.pl | 14 +++++++++++++-
5 files changed, 36 insertions(+), 13 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 7918e398c7..f7331fba32 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1578,7 +1578,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal> and <literal>prefer-read</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 55bf9f1522..0f89af0787 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1245,6 +1245,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
+ else if (strcmp(conn->target_session_attrs, "read-only") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_ONLY;
else
{
conn->status = CONNECTION_BAD;
@@ -3254,7 +3256,8 @@ keep_going: /* We will come back to here until there is
*/
if (conn->sversion >= 70400 &&
(conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY))
{
if (conn->sversion < 120000)
{
@@ -3284,8 +3287,9 @@ keep_going: /* We will come back to here until there is
else if ((conn->transaction_read_only &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!conn->transaction_read_only &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2))
+ ((conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2) ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/* Not a requested type; fail this connection. */
const char *displayed_host;
@@ -3319,6 +3323,7 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (!conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
@@ -3344,15 +3349,17 @@ keep_going: /* We will come back to here until there is
* Requested type is prefer-read, then record this host index
* and try the other before considering it later
*/
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2)
+ if ((conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2) ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)
{
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->read_write_host_index == -1)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
/*
@@ -3443,8 +3450,9 @@ keep_going: /* We will come back to here until there is
if ((readonly_server &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!readonly_server &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2))
+ ((conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2) ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/* Not a requested type; fail this connection. */
PQclear(res);
@@ -3477,7 +3485,9 @@ keep_going: /* We will come back to here until there is
sendTerminateConn(conn);
/* Record read-write host index */
- if (!readonly_server && conn->read_write_host_index == -1)
+ if (!readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
/*
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 4b0fc80df2..5d0b885dae 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -74,7 +74,8 @@ typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
- SESSION_TYPE_PREFER_READ /* Prefer read only session */
+ SESSION_TYPE_PREFER_READ, /* Prefer read only session */
+ SESSION_TYPE_READ_ONLY /* Read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index d6c482fd0b..8690f86956 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,7 +365,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read.
+ * prefer-read and read-only.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 0e398136a5..0d6f279586 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 29;
+use Test::More tests => 32;
# Initialize master node
my $node_master = get_new_node('master');
@@ -129,6 +129,18 @@ test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
test_target_session_attrs($node_master, $node_master, $node_master,
"prefer-read", 0);
+# Connect to standby1 in "read-only" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "read-only", 0);
+
+# Connection should fail in "read-only" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "read-only", 1);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
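The combined accept/reject condition that patches 0004 and 0005 build up in fe-connect.c can be easier to read as a standalone predicate. The sketch below restates that condition under assumptions: `host_is_rejected()` is a hypothetical name, and the reading of read_write_host_index (-1 meaning no read-write host recorded yet, -2 meaning read-write hosts are now acceptable as a fallback) is inferred from the hunks shown here rather than stated by the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the enum in libpq-fe.h after patch 0005. */
typedef enum
{
    SESSION_TYPE_ANY = 0,       /* Any session (default) */
    SESSION_TYPE_READ_WRITE,    /* Read-write session */
    SESSION_TYPE_PREFER_READ,   /* Prefer read only session */
    SESSION_TYPE_READ_ONLY      /* Read only session */
} TargetSessionAttrsType;

/*
 * Hypothetical predicate: should the current host be skipped?
 *
 * server_read_only corresponds to conn->transaction_read_only (or the
 * result of the SHOW query on older servers).
 */
static bool
host_is_rejected(TargetSessionAttrsType requested,
                 bool server_read_only,
                 int read_write_host_index)
{
    /* Assumption: valid values are -2, -1, or a host array index. */
    assert(read_write_host_index >= -2);

    if (server_read_only)
        return requested == SESSION_TYPE_READ_WRITE;

    /* Read-write server: prefer-read skips it unless in fallback mode. */
    return (requested == SESSION_TYPE_PREFER_READ &&
            read_write_host_index != -2) ||
        requested == SESSION_TYPE_READ_ONLY;
}
```

Written this way, it is visible that "any" never rejects a host and that "read-only" rejects a read-write server regardless of the fallback state, which matches the three new test cases in 001_stream_rep.pl.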
0001-Restructure-the-code-to-remove-duplicate-code.patch (application/octet-stream)
From 47cfcb9c574c1d2d3c33068f1ff3dc09ee281cac Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 21 Feb 2019 23:11:55 +1100
Subject: [PATCH 1/5] Restructure the code to remove duplicate code
The duplicated logic that checks the server version before issuing
SHOW transaction_read_only, to find out whether the server is
read-write, is restructured under a new connection status so that
the duplication is removed. This is required for the next set of
patches.
---
doc/src/sgml/libpq.sgml | 26 +++++---
src/interfaces/libpq/fe-connect.c | 99 ++++++++++++-------------------
src/interfaces/libpq/libpq-fe.h | 3 +-
3 files changed, 57 insertions(+), 71 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c1d1b6b2db..ac78524924 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1576,17 +1576,27 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<varlistentry id="libpq-connect-target-session-attrs" xreflabel="target_session_attrs">
<term><literal>target_session_attrs</literal></term>
<listitem>
+ <para>
+ The supported options for this parameter are, <literal>any</literal> and
+ <literal>read-write</literal>. The default value of this parameter,
+ <literal>any</literal>, regards all connections as acceptable.
+ If multiple hosts were specified in the connection string, based on the
+ specified value, any remaining servers will be tried before confirming
+ succesful connection or failure.
+ </para>
+
<para>
If this parameter is set to <literal>read-write</literal>, only a
connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server supports read-write transactions are not,
+ query <literal>SHOW transaction_read_only</literal> will be sent upon any
+ successful connection; if it returns <literal>on</literal>, means server
+ doesn't support read-write transactions.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index c96a52bb1b..09e010de7e 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3184,6 +3184,43 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_WRITING;
}
+ conn->status = CONNECTION_CHECK_TARGET;
+ goto keep_going;
+ }
+
+ case CONNECTION_SETENV:
+ {
+
+ /*
+ * Do post-connection housekeeping (only needed in protocol 2.0).
+ *
+ * We pretend that the connection is OK for the duration of these
+ * queries.
+ */
+ conn->status = CONNECTION_OK;
+
+ switch (pqSetenvPoll(conn))
+ {
+ case PGRES_POLLING_OK: /* Success */
+ break;
+
+ case PGRES_POLLING_READING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_READING;
+
+ case PGRES_POLLING_WRITING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_WRITING;
+
+ default:
+ goto error_return;
+ }
+ }
+
+ /* Intentional fall through */
+
+ case CONNECTION_CHECK_TARGET:
+ {
/*
* If a read-write connection is required, see if we have one.
*
@@ -3225,68 +3262,6 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_OK;
}
- case CONNECTION_SETENV:
-
- /*
- * Do post-connection housekeeping (only needed in protocol 2.0).
- *
- * We pretend that the connection is OK for the duration of these
- * queries.
- */
- conn->status = CONNECTION_OK;
-
- switch (pqSetenvPoll(conn))
- {
- case PGRES_POLLING_OK: /* Success */
- break;
-
- case PGRES_POLLING_READING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_READING;
-
- case PGRES_POLLING_WRITING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_WRITING;
-
- default:
- goto error_return;
- }
-
- /*
- * If a read-write connection is required, see if we have one.
- * (This should match the stanza in the CONNECTION_AUTH_OK case
- * above.)
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but by
- * the same token they don't have any read-only mode, so we may
- * just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
- {
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
- {
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
- }
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
- }
-
- /* We can release the address list now. */
- release_conn_addrinfo(conn);
-
- /* We are open for business! */
- conn->status = CONNECTION_OK;
- return PGRES_POLLING_OK;
-
case CONNECTION_CONSUME:
{
conn->status = CONNECTION_OK;
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 97bc98b1f3..50cfe266c6 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -65,8 +65,9 @@ typedef enum
CONNECTION_NEEDED, /* Internal state: connect() needed */
CONNECTION_CHECK_WRITABLE, /* Check if we could make a writable
* connection. */
- CONNECTION_CONSUME /* Wait for any pending message and consume
+ CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
+ CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
typedef enum
--
2.20.1.windows.1
0004-New-prefer-read-target_session_attrs-type.patch (application/octet-stream)
From a7cdbcc3681659e036a7db1821a097a07f5e361e Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 16:16:25 +1100
Subject: [PATCH 4/5] New prefer-read target_session_attrs type
With this prefer-read option type, an application can prefer
connecting to a read-only server if one is available in the list
of hosts, and otherwise connect to a read-write server.
---
doc/src/sgml/libpq.sgml | 21 ++--
src/interfaces/libpq/fe-connect.c | 146 +++++++++++++++++++++-----
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 13 ++-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
5 files changed, 163 insertions(+), 34 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 7b9d219bde..7918e398c7 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1577,12 +1577,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- The supported options for this parameter are, <literal>any</literal> and
- <literal>read-write</literal>. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- If multiple hosts were specified in the connection string, based on the
- specified value, any remaining servers will be tried before confirming
- succesful connection or failure.
+ The supported options for this parameter are, <literal>any</literal>,
+ <literal>read-write</literal> and <literal>prefer-read</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts were specified in the
+ connection string, based on the specified value, any remaining servers
+ will be tried before confirming succesful connection or failure.
</para>
<para>
@@ -1591,6 +1591,13 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
is considered acceptable.
</para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, only a
+ connection in which read-only transactions are accepted by default
+ is preferred. If no such connections can be found, then a connection
+ in which read-write transactions accepted will be considered.
+ </para>
+
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
@@ -1600,7 +1607,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
GUC that is reported by the server upon successful connection.
</para>
</listitem>
- </varlistentry>
+ </varlistentry>
</variablelist>
</para>
</sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 718a981ed6..55bf9f1522 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -322,7 +322,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1243,6 +1243,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_ANY;
else if (strcmp(conn->target_session_attrs, "read-write") == 0)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else
{
conn->status = CONNECTION_BAD;
@@ -2137,13 +2139,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means, failed to connect to read-only servers
+ * and now try connect to read-write server again.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is already
+ * set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3225,14 +3245,16 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required, see
+ * if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
if (conn->sversion < 120000)
{
@@ -3253,13 +3275,19 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
+
conn->status = CONNECTION_CHECK_WRITABLE;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
- else if (conn->transaction_read_only)
+ else if ((conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
const char *displayed_host;
const char *displayed_port;
@@ -3272,16 +3300,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (!conn->transaction_read_only &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3289,6 +3329,38 @@ keep_going: /* We will come back to here until there is
conn->try_next_host = true;
goto keep_going;
}
+ else /* obtained the requested type, consume it */
+ {
+ /* We can release the address list now. */
+ release_conn_addrinfo(conn);
+
+ /* We are open for business! */
+ conn->status = CONNECTION_OK;
+ return PGRES_POLLING_OK;
+ }
+ }
+
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2)
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
/* We can release the address list now. */
@@ -3356,11 +3428,25 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested mode is read-write,
+ * ignore it. Server is read-write and requested mode is
+ * prefer-read, record it for the first time and try to
+ * consume in the next scan (it means no read-only server
+ * is found in the first scan).
+ */
+ if ((readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3373,16 +3459,27 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (!readonly_server && conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3391,7 +3488,7 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3575,6 +3672,7 @@ makeEmptyPGconn(void)
conn->sock = PGINVALID_SOCKET;
conn->requested_session_type = SESSION_TYPE_ANY;
+ conn->read_write_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 6961d7ce8e..4b0fc80df2 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -73,7 +73,8 @@ typedef enum
typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
- SESSION_TYPE_READ_WRITE /* Read-write session */
+ SESSION_TYPE_READ_WRITE, /* Read-write session */
+ SESSION_TYPE_PREFER_READ /* Prefer read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index b0ac98b90d..d6c482fd0b 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -363,7 +363,10 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: any, read-write,
+ * prefer-read.
+ */
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -400,6 +403,14 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * First read-write host index in the connection string.
+ *
+ * Initial value is -1, then the index of the first read-write host, -2
+ * during the second attempt of connection to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index beb45551a2..0e398136a5 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 26;
+use Test::More tests => 29;
# Initialize master node
my $node_master = get_new_node('master');
@@ -117,6 +117,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
Attached are the updated patches.
Thanks, all look fixed.
The target_server_type option yet to be implemented.
Please let me review once more and proceed to testing when the above is added, to make sure the final code looks good. I'd like to see how complex the if conditions in multiple places would be after adding target_server_type, and consider whether we can simplify them together with you. Even now, the if conditions seem complicated to me... that's probably due to the existence of read_write_host_index.
Regards
Takayuki Tsunakawa
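For readers following the review comment about read_write_host_index, here is a minimal standalone sketch of the two-scan selection that the prefer-read patch appears to implement. It is not code from the patch: the function name, the enum names and the assumption that every host is reachable are all illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for the two special values of read_write_host_index. */
enum
{
    RW_HOST_UNSET = -1,     /* no read-write host has been seen yet */
    RW_HOST_FALLBACK = -2   /* the patch stores this during the second attempt */
};

/*
 * Return the index of the host a prefer-read connection would end up on,
 * or -1 if there are no hosts.  host_is_read_only[i] stands in for the
 * result of probing host i; every host is assumed to be reachable.
 */
int
choose_prefer_read_host(const bool *host_is_read_only, int nhosts)
{
    int read_write_host_index = RW_HOST_UNSET;
    int whichhost;

    assert(nhosts <= 0 || host_is_read_only != NULL);

    /* First scan: take the first read-only host, remember the first read-write one. */
    for (whichhost = 0; whichhost < nhosts; whichhost++)
    {
        if (host_is_read_only[whichhost])
            return whichhost;
        if (read_write_host_index == RW_HOST_UNSET)
            read_write_host_index = whichhost;
    }

    /*
     * No read-only host was found.  Go back to the remembered read-write
     * host; at this point the patch sets the index to -2 (RW_HOST_FALLBACK
     * here) so that the second attempt is accepted instead of rescanning.
     */
    if (read_write_host_index >= 0)
        return read_write_host_index;

    return -1;
}
```

The three outcomes correspond to the three cases added to 001_stream_rep.pl: master,standby1 and standby1,master both end on the standby, and a master-only list falls back to the master.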
Now I will add another parameter, target_server_type, to choose the
primary, standby or prefer-standby server,
as discussed upthread, with a new GUC variable.
So, just to further confuse things, here is a use case for "preferPrimary".
This is from the pgjdbc list.
"if the master instance fails, we would like the driver to communicate with
the secondary instance for read-only operations before the failover process
is commenced. The second use-case is when the master instance is
deliberately shut down for maintenance reasons and we do not want to fail
over to the secondary instance, but instead allow it to process user
queries throughout the maintenance."
see this for the thread.
/messages/by-id/VI1PR05MB5295AE43EF9525EACC9E57ECBC750@VI1PR05MB5295.eurprd05.prod.outlook.com
Dave Cramer
davec@postgresintl.com
www.postgresintl.com
On Thu, Feb 28, 2019 at 1:00 PM Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
Attached are the updated patches.
Thanks, all look fixed.
The target_server_type option yet to be implemented.
Please let me review once more and proceed to testing when the above is
added, to make sure the final code looks good. I'd like to see how complex
the if conditions in multiple places would be after adding
target_server_type, and consider whether we can simplify them together with
you. Even now, the if conditions seem complicated to me... that's probably
due to the existence of read_write_host_index.
Yes, the if checks are a little complex because of the additional checks.
I will see whether there is an easier way to update them without
introducing code duplication.
While working on the implementation of target_server_type, a new libpq
connection option to specify master, slave, etc., I found no problem when
the new option is used on its own. But when it is combined with the
existing target_session_attrs, some combinations may be invalid, or no
such server may exist.
Target_session_attrs    Target_server_type
read-write              prefer-slave, slave
prefer-read             master, slave
read-only               master, prefer-slave
I know that some of the cases above are possible, such as a master server
that accepts only read-only sessions by default. Instead of adding a check
to validate which combinations are right, how about allowing all
combinations? If a combination is not possible, no server will satisfy the
requirement and the connection fails.
Comments?
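To make the "allow every combination and simply fail" proposal concrete, here is a small sketch under stated assumptions. The enum and function names are hypothetical, primary/standby stand for master/slave in the table above, and the "prefer-" variants are left out because they are preferences rather than hard requirements.

```c
#include <stdbool.h>

/* Hypothetical requirement types; not taken from the patch. */
typedef enum { ATTRS_ANY, ATTRS_READ_WRITE, ATTRS_READ_ONLY } SessionAttrsReq;
typedef enum { SERVER_ANY, SERVER_PRIMARY, SERVER_STANDBY } ServerTypeReq;

/* What the client learned about one server after connecting to it. */
typedef struct
{
    bool in_recovery;   /* server is a standby */
    bool read_only;     /* sessions are read-only by default */
} ServerState;

/* A server is acceptable only if it satisfies both requirements. */
bool
server_matches(ServerState s, SessionAttrsReq attrs, ServerTypeReq type)
{
    if (type == SERVER_PRIMARY && s.in_recovery)
        return false;
    if (type == SERVER_STANDBY && !s.in_recovery)
        return false;
    if (attrs == ATTRS_READ_WRITE && s.read_only)
        return false;
    if (attrs == ATTRS_READ_ONLY && !s.read_only)
        return false;
    return true;
}

/*
 * Return the index of the first acceptable host, or -1.  No combination is
 * rejected up front; an unsatisfiable one just matches no host, which the
 * caller would report as a connection failure.
 */
int
first_matching_host(const ServerState *hosts, int nhosts,
                    SessionAttrsReq attrs, ServerTypeReq type)
{
    for (int i = 0; i < nhosts; i++)
    {
        if (server_matches(hosts[i], attrs, type))
            return i;
    }
    return -1;
}
```

With one ordinary primary and one standby in the list, read-write combined with standby matches nothing and fails, while read-only combined with primary succeeds only if some primary defaults to read-only sessions.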
Also, as we need to support the new option when connecting to servers < 12,
should this option send the command "select pg_is_in_recovery()" to the
server to find out whether the server is in recovery mode or not?
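As an illustration of the two probes being discussed, the sketch below interprets the single text value each query returns. The helper names are hypothetical. The first helper mirrors the strncmp(val, "on", 2) test in the patch; the second assumes the boolean result of pg_is_in_recovery() arrives in text format, where PostgreSQL prints true as "t" and false as "f".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Interpret the value returned by "SHOW transaction_read_only".
 * The server reports "on" for a read-only session and "off" otherwise.
 */
bool
value_means_read_only(const char *val)
{
    return val != NULL && strncmp(val, "on", 2) == 0;
}

/*
 * Interpret the value returned by "select pg_is_in_recovery()".
 * A boolean in text format is "t" or "f".
 */
bool
value_means_in_recovery(const char *val)
{
    return val != NULL && val[0] == 't';
}
```

In libpq itself the value would come from PQgetvalue(res, 0, 0) after checking that the result has exactly one row, as the existing CONNECTION_CHECK_WRITABLE code does.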
Also, regarding the implementation, the new target_server_type option is
validated separately; that is, the check for the required server is handled
in a separate switch case. When both options are given, should we first
find the required server for target_session_attrs and then validate the
same server again with target_server_type?
Regards,
Haribabu Kommi
Fujitsu Australia
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
Target_session_attrs    Target_server_type
read-write              prefer-slave, slave
prefer-read             master, slave
read-only               master, prefer-slave
I know that some of the cases above are possible, such as a master server
that accepts only read-only sessions by default. Instead of adding a check
to validate which combinations are right, how about allowing all
combinations? If a combination is not possible, no server will satisfy the
requirement and the connection fails.
Comments?
I think that's OK.
To follow the existing naming, it seems better to use "primary" and "standby" instead of master and slave -- primary_conninfo, synchronous_standby_names, hot_standby, max_standby_streaming_delay and such.
Also, as we need to support the new option when connecting to servers < 12,
should this option send the command "select pg_is_in_recovery()" to the
server to find out whether the server is in recovery mode or not?
The query looks good. OTOH, since the parameter is new, I think we can return an error when target_server_type is specified against older servers, if supporting them would make the libpq code uglier.
Also, regarding the implementation, the new target_server_type option is
validated separately; that is, the check for the required server is handled
in a separate switch case. When both options are given, should we first
find the required server for target_session_attrs and then validate the
same server again with target_server_type?
Logically, it seems the order should be reversed: check the server type first, then the session attributes, considering that other session attributes may be added in the future.
Regards
Takayuki Tsunakawa
On Mon, Mar 18, 2019 at 9:33 PM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
While working on the implementation of target_server_type, a new libpq
connection option to specify master, slave, etc., I found no problem when
the new option is used on its own. But when it is combined with the
existing target_session_attrs, some combinations may be invalid, or no
such server may exist.
Target_session_attrs    Target_server_type
read-write              prefer-slave, slave
prefer-read             master, slave
read-only               master, prefer-slave
I know that some of the cases above are possible, such as a master server
that accepts only read-only sessions by default. Instead of adding a check
to validate which combinations are right, how about allowing all
combinations? If a combination is not possible, no server will satisfy the
requirement and the connection fails.
Comments?
I really dislike having both target_session_attrs and
target_server_type. It doesn't solve any actual problem. master,
slave, prefer-slave, or whatever you like could be put in
target_session_attrs just as easily, and then we wouldn't end up with
two keywords doing closely related things. 'master' is no more or
less a server attribute than 'read-write'.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: Robert Haas [mailto:robertmhaas@gmail.com]
I really dislike having both target_session_attrs and
target_server_type. It doesn't solve any actual problem. master,
slave, prefer-slave, or whatever you like could be put in
target_session_attrs just as easily, and then we wouldn't end up with
two keywords doing closely related things. 'master' is no more or
less a server attribute than 'read-write'.
Hmm, that may be OK. At first, I felt it strange to treat the server type (primary or standby) as a session attribute. But in a sense, the server type is one attribute of what a session is established for. I'm inclined to agree with:
target_session_attrs = {any | read-write | read-only | prefer-read | primary | standby | prefer-standby}
Regards
Takayuki Tsunakawa
On Wed, Mar 20, 2019 at 5:01 PM Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
From: Robert Haas [mailto:robertmhaas@gmail.com]
I really dislike having both target_session_attrs and
target_server_type. It doesn't solve any actual problem. master,
slave, prefer-slave, or whatever you like could be put in
target_session_attrs just as easily, and then we wouldn't end up with
two keywords doing closely related things. 'master' is no more or
less a server attribute than 'read-write'.
Hmm, that may be OK. At first, I felt it strange to treat the server type
(primary or standby) as a session attribute. But in a sense, the server
type is one attribute of what a session is established for. I'm
inclined to agree with:
target_session_attrs = {any | read-write | read-only | prefer-read |
primary | standby | prefer-standby}
Thanks for your suggestions.
Based on the above new options that can be added to target_session_attrs,
primary - it is just an alias to the read-write option.
standby, prefer-standby - These options should check whether server is
running in recovery mode or not
instead of checking whether server accepts read-only connections or not?
Regards,
Haribabu Kommi
Fujitsu Australia
On Thu, Mar 21, 2019 at 2:26 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
Based on the above new options that can be added to target_session_attrs,
primary - it is just an alias to the read-write option.
standby, prefer-standby - These options should check whether server is running in recovery mode or not
instead of checking whether server accepts read-only connections or not?
I think it will be best to have one set of attributes that check
default_transaction_read_only and a differently-named set that check
pg_is_in_recovery(). For each, there should be one value that looks
for a 'true' return and one value that looks for a 'false' return and
perhaps values that accept either but prefer one or the other.
IOW, there's no reason to make primary an alias for read-write. If
you want read-write, you can just say read-write. But we can make
'primary' or 'master' look for a server that's not in recovery and
'standby' look for one that is.
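One way to picture the two families of values described above is a lookup table in which each keyword names the probe it relies on, the result it looks for, and whether the other result is acceptable as a fallback. This is only a sketch: the type and function names are hypothetical, and the keyword list is the one proposed earlier in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Which server property a keyword is decided on (hypothetical names). */
typedef enum
{
    PROBE_NONE,         /* "any": nothing to check */
    PROBE_READ_ONLY,    /* default transaction read-only state */
    PROBE_IN_RECOVERY   /* pg_is_in_recovery() */
} ProbeKind;

typedef struct
{
    const char *keyword;
    ProbeKind   probe;
    bool        wanted;     /* probe result the keyword looks for */
    bool        fallback;   /* accept the other result if nothing matches */
} AttrRule;

static const AttrRule attr_rules[] = {
    {"any",            PROBE_NONE,        false, true},
    {"read-write",     PROBE_READ_ONLY,   false, false},
    {"read-only",      PROBE_READ_ONLY,   true,  false},
    {"prefer-read",    PROBE_READ_ONLY,   true,  true},
    {"primary",        PROBE_IN_RECOVERY, false, false},
    {"standby",        PROBE_IN_RECOVERY, true,  false},
    {"prefer-standby", PROBE_IN_RECOVERY, true,  true},
};

/* Return the rule for a keyword, or NULL if the keyword is not recognized. */
const AttrRule *
lookup_attr_rule(const char *keyword)
{
    size_t i;

    if (keyword == NULL)
        return NULL;
    for (i = 0; i < sizeof(attr_rules) / sizeof(attr_rules[0]); i++)
    {
        if (strcmp(attr_rules[i].keyword, keyword) == 0)
            return &attr_rules[i];
    }
    return NULL;
}
```

In this layout "primary" and "read-write" are distinct rows that rely on different probes, which is the point of not making one an alias for the other.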
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Mar 22, 2019 at 6:57 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 21, 2019 at 2:26 AM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
Based on the above new options that can be added to target_session_attrs,
primary - it is just an alias to the read-write option.
standby, prefer-standby - These options should check whether server is
running in recovery mode or not
instead of checking whether server accepts read-only connections or not?
I think it will be best to have one set of attributes that check
default_transaction_read_only and a differently-named set that check
pg_is_in_recovery(). For each, there should be one value that looks
for a 'true' return and one value that looks for a 'false' return and
perhaps values that accept either but prefer one or the other.
IOW, there's no reason to make primary an alias for read-write. If
you want read-write, you can just say read-write. But we can make
'primary' or 'master' look for a server that's not in recovery and
'standby' look for one that is.
OK, I agree with your opinion. I will produce a patch for those new options.
Regards,
Haribabu Kommi
Fujitsu Australia
On Fri, Mar 22, 2019 at 7:32 AM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
On Fri, Mar 22, 2019 at 6:57 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Mar 21, 2019 at 2:26 AM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
Based on the above new options that can be added to
target_session_attrs,
primary - it is just an alias to the read-write option.
standby, prefer-standby - These options should check whether server is
running in recovery mode or not
instead of checking whether server accepts read-only connections or not?
I think it will be best to have one set of attributes that check
default_transaction_read_only and a differently-named set that check
pg_is_in_recovery(). For each, there should be one value that looks
for a 'true' return and one value that looks for a 'false' return and
perhaps values that accept either but prefer one or the other.
IOW, there's no reason to make primary an alias for read-write. If
you want read-write, you can just say read-write. But we can make
'primary' or 'master' look for a server that's not in recovery and
'standby' look for one that is.
OK, I agree with your opinion. I will produce a patch for those new options.
Here I have attached a WIP patch for the new options, along with the older
patches.
The basic cases work fine; docs and tests are missing.
Currently this patch doesn't implement GUC_REPORT for recovery mode.
I have yet to optimize the complex if check.
Regards,
Haribabu Kommi
Fujitsu Australia
Attachments:
0001-Restructure-the-code-to-remove-duplicate-code.patch (application/octet-stream)
From b60953008e898e0127aa6912686a869d1b352b56 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 21 Feb 2019 23:11:55 +1100
Subject: [PATCH 1/6] Restructure the code to remove duplicate code
The duplicated logic that checks the server version before
issuing SHOW transaction_read_only, to find out whether the
server is read-write or not, is restructured under a new
connection status, so that the duplicate code is removed. This
is required for the next set of patches.
---
doc/src/sgml/libpq.sgml | 26 +++++---
src/interfaces/libpq/fe-connect.c | 99 ++++++++++++-------------------
src/interfaces/libpq/libpq-fe.h | 3 +-
3 files changed, 57 insertions(+), 71 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c1d1b6b2db..ac78524924 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1576,17 +1576,27 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<varlistentry id="libpq-connect-target-session-attrs" xreflabel="target_session_attrs">
<term><literal>target_session_attrs</literal></term>
<listitem>
+ <para>
+ The supported options for this parameter are, <literal>any</literal> and
+ <literal>read-write</literal>. The default value of this parameter,
+ <literal>any</literal>, regards all connections as acceptable.
+ If multiple hosts were specified in the connection string, based on the
+ specified value, any remaining servers will be tried before confirming
+ succesful connection or failure.
+ </para>
+
<para>
If this parameter is set to <literal>read-write</literal>, only a
connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server supports read-write transactions are not,
+ query <literal>SHOW transaction_read_only</literal> will be sent upon any
+ successful connection; if it returns <literal>on</literal>, means server
+ doesn't support read-write transactions.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index e3bf6a7449..c68448786d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3188,6 +3188,43 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_WRITING;
}
+ conn->status = CONNECTION_CHECK_TARGET;
+ goto keep_going;
+ }
+
+ case CONNECTION_SETENV:
+ {
+
+ /*
+ * Do post-connection housekeeping (only needed in protocol 2.0).
+ *
+ * We pretend that the connection is OK for the duration of these
+ * queries.
+ */
+ conn->status = CONNECTION_OK;
+
+ switch (pqSetenvPoll(conn))
+ {
+ case PGRES_POLLING_OK: /* Success */
+ break;
+
+ case PGRES_POLLING_READING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_READING;
+
+ case PGRES_POLLING_WRITING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_WRITING;
+
+ default:
+ goto error_return;
+ }
+ }
+
+ /* Intentional fall through */
+
+ case CONNECTION_CHECK_TARGET:
+ {
/*
* If a read-write connection is required, see if we have one.
*
@@ -3229,68 +3266,6 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_OK;
}
- case CONNECTION_SETENV:
-
- /*
- * Do post-connection housekeeping (only needed in protocol 2.0).
- *
- * We pretend that the connection is OK for the duration of these
- * queries.
- */
- conn->status = CONNECTION_OK;
-
- switch (pqSetenvPoll(conn))
- {
- case PGRES_POLLING_OK: /* Success */
- break;
-
- case PGRES_POLLING_READING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_READING;
-
- case PGRES_POLLING_WRITING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_WRITING;
-
- default:
- goto error_return;
- }
-
- /*
- * If a read-write connection is required, see if we have one.
- * (This should match the stanza in the CONNECTION_AUTH_OK case
- * above.)
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but by
- * the same token they don't have any read-only mode, so we may
- * just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
- {
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
- {
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
- }
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
- }
-
- /* We can release the address list now. */
- release_conn_addrinfo(conn);
-
- /* We are open for business! */
- conn->status = CONNECTION_OK;
- return PGRES_POLLING_OK;
-
case CONNECTION_CONSUME:
{
conn->status = CONNECTION_OK;
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 97bc98b1f3..50cfe266c6 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -65,8 +65,9 @@ typedef enum
CONNECTION_NEEDED, /* Internal state: connect() needed */
CONNECTION_CHECK_WRITABLE, /* Check if we could make a writable
* connection. */
- CONNECTION_CONSUME /* Wait for any pending message and consume
+ CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
+ CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
typedef enum
--
2.20.1.windows.1
0002-New-TargetSessionAttrsType-enum.patch (application/octet-stream)
From 8292bdeb1adf0a0270ed721d330877de0456180d Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 11:50:33 +1100
Subject: [PATCH 2/6] New TargetSessionAttrsType enum
This new enum is useful to compare the requested session type
instead of comparing it with string always. This may not show
much improvement with current code, but it will be useful with
further patches
---
src/interfaces/libpq/fe-connect.c | 12 ++++++++----
src/interfaces/libpq/libpq-fe.h | 6 ++++++
src/interfaces/libpq/libpq-int.h | 1 +
3 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index c68448786d..1dc22e6bdf 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1243,8 +1243,11 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ else if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -3233,8 +3236,7 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type != SESSION_TYPE_ANY)
{
/*
* Save existing error messages across the PQsendQuery
@@ -3540,6 +3542,8 @@ makeEmptyPGconn(void)
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
+ conn->requested_session_type = SESSION_TYPE_ANY;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 50cfe266c6..6961d7ce8e 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -70,6 +70,12 @@ typedef enum
CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
+typedef enum
+{
+ SESSION_TYPE_ANY = 0, /* Any session (default) */
+ SESSION_TYPE_READ_WRITE /* Read-write session */
+} TargetSessionAttrsType;
+
typedef enum
{
PGRES_POLLING_FAILED = 0,
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index dbe0f7e5c0..551120660c 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,6 +365,7 @@ struct pg_conn
/* Type of connection to make. Possible values: any, read-write. */
char *target_session_attrs;
+ TargetSessionAttrsType requested_session_type;
/* Optional file to write trace info to */
FILE *Pfdebug;
--
2.20.1.windows.1
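
As a rough illustration of what patch 2/6 does in connectOptions2(), the sketch below maps the option string to the enum once, so later code can compare enum values instead of calling strcmp() again. The helper name parse_target_session_attrs is hypothetical and not part of the patch; the enum mirrors the one added to libpq-fe.h. Treating a NULL string as "any" is an assumption that matches the default set in makeEmptyPGconn().

```c
#include <stdbool.h>
#include <string.h>

/* Mirrors the enum added to libpq-fe.h by patch 2/6. */
typedef enum
{
    SESSION_TYPE_ANY = 0,       /* Any session (default) */
    SESSION_TYPE_READ_WRITE     /* Read-write session */
} TargetSessionAttrsType;

/*
 * Hypothetical helper (not in the patch): translate the option string once.
 * Returns false for an unrecognized value, which is the case where the
 * patch sets CONNECTION_BAD and reports an error.
 */
bool
parse_target_session_attrs(const char *value, TargetSessionAttrsType *result)
{
    if (value == NULL || strcmp(value, "any") == 0)
        *result = SESSION_TYPE_ANY;         /* assumed default when unset */
    else if (strcmp(value, "read-write") == 0)
        *result = SESSION_TYPE_READ_WRITE;
    else
        return false;                       /* invalid option value */
    return true;
}
```

Later patches in the series only need to add one more branch here per new option value.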
0005-New-read-only-target_session_attrs-type.patch (application/octet-stream)
From ec766623b19300c12121ce9ca694121fbb862354 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Sun, 17 Mar 2019 02:22:00 +1100
Subject: [PATCH 5/6] New read-only target_session_attrs type
With the read-only option, an application can connect to a
read-only server from the list of hosts. If no read-only
server is available, the connection attempt fails.
---
doc/src/sgml/libpq.sgml | 2 +-
src/interfaces/libpq/fe-connect.c | 27 ++++++++++++++++++---------
src/interfaces/libpq/libpq-fe.h | 3 ++-
src/interfaces/libpq/libpq-int.h | 2 +-
src/test/recovery/t/001_stream_rep.pl | 10 +++++++++-
5 files changed, 31 insertions(+), 13 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 7918e398c7..f7331fba32 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1578,7 +1578,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal> and <literal>prefer-read</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 8de0d97876..2e16841703 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1249,6 +1249,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
+ else if (strcmp(conn->target_session_attrs, "read-only") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_ONLY;
else
{
conn->status = CONNECTION_BAD;
@@ -3249,7 +3251,7 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write or prefer-read connection is required, see
+ * If a read-write, prefer-read or read-only connection is required, see
* if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
@@ -3287,8 +3289,9 @@ keep_going: /* We will come back to here until there is
else if ((conn->transaction_read_only &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!conn->transaction_read_only &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2))
+ ((conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2) ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/* Not a requested type; fail this connection. */
const char *displayed_host;
@@ -3322,6 +3325,7 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (!conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
@@ -3347,15 +3351,17 @@ keep_going: /* We will come back to here until there is
* Requested type is prefer-read, then record this host index
* and try the other before considering it later
*/
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2)
+ if ((conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2) ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)
{
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->read_write_host_index == -1)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
/*
@@ -3446,8 +3452,9 @@ keep_going: /* We will come back to here until there is
if ((readonly_server &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!readonly_server &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2))
+ ((conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2) ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/* Not a requested type; fail this connection. */
PQclear(res);
@@ -3480,7 +3487,9 @@ keep_going: /* We will come back to here until there is
sendTerminateConn(conn);
/* Record read-write host index */
- if (!readonly_server && conn->read_write_host_index == -1)
+ if (!readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
/*
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 4b0fc80df2..5d0b885dae 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -74,7 +74,8 @@ typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
- SESSION_TYPE_PREFER_READ /* Prefer read only session */
+ SESSION_TYPE_PREFER_READ, /* Prefer read only session */
+ SESSION_TYPE_READ_ONLY /* Read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 13f971b5a5..62a1b7bbd4 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,7 +365,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read.
+ * prefer-read and read-only.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 0e398136a5..651107f49b 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 29;
+use Test::More tests => 31;
# Initialize master node
my $node_master = get_new_node('master');
@@ -129,6 +129,14 @@ test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
test_target_session_attrs($node_master, $node_master, $node_master,
"prefer-read", 0);
+# Connect to standby1 in "read-only" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "read-only", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
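
The accept/reject condition that patch 5/6 extends in CONNECTION_CHECK_TARGET and CONNECTION_CHECK_WRITABLE can be restated as a small pure function. This is only a sketch: host_is_rejected and its second_pass parameter are hypothetical names, where second_pass stands for the patch's read_write_host_index == -2 state.

```c
#include <stdbool.h>

/* Mirrors the enum as it stands after patch 5/6. */
typedef enum
{
    SESSION_TYPE_ANY = 0,       /* Any session (default) */
    SESSION_TYPE_READ_WRITE,    /* Read-write session */
    SESSION_TYPE_PREFER_READ,   /* Prefer read only session */
    SESSION_TYPE_READ_ONLY      /* Read only session */
} TargetSessionAttrsType;

/*
 * Hypothetical helper (not in the patch): returns true when the current
 * host does not match the requested type and the next host should be tried.
 */
bool
host_is_rejected(TargetSessionAttrsType requested,
                 bool server_read_only,
                 bool second_pass)
{
    switch (requested)
    {
        case SESSION_TYPE_READ_WRITE:
            return server_read_only;
        case SESSION_TYPE_PREFER_READ:
            /* read-write hosts are only acceptable on the second pass */
            return !server_read_only && !second_pass;
        case SESSION_TYPE_READ_ONLY:
            /* never falls back to a read-write host */
            return !server_read_only;
        default:
            return false;       /* "any" accepts every host */
    }
}
```

Because read-only never records a read-write host index, there is no second pass for it, and the attempt fails once the host list is exhausted.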
0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch (application/octet-stream)
From ed08223d1b9196f77dec63a8d746c21e602d2bb1 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 16:12:24 +1100
Subject: [PATCH 3/6] Make transaction_read_only a GUC_REPORT variable
The transaction_read_only GUC is used in multi-host connections to
identify a read-write host. Currently this is done by executing a
command on each host to find out whether it is read-write. Reporting
the GUC to the client upon connection instead reduces the time needed
to make the connection.
---
doc/src/sgml/libpq.sgml | 14 ++++---
doc/src/sgml/protocol.sgml | 8 ++--
src/backend/utils/misc/guc.c | 2 +-
src/interfaces/libpq/fe-connect.c | 70 +++++++++++++++++++++++--------
src/interfaces/libpq/fe-exec.c | 6 ++-
src/interfaces/libpq/libpq-int.h | 1 +
6 files changed, 74 insertions(+), 27 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index ac78524924..7b9d219bde 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1594,8 +1594,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, means server
- doesn't support read-write transactions.
+ successful connection if the server is prior to version 12; if it returns
+ <literal>on</literal>, the server doesn't support read-write transactions.
+ For servers of version 12 or greater, libpq uses the <varname>transaction_read_only</varname>
+ GUC that is reported by the server upon successful connection.
</para>
</listitem>
</varlistentry>
@@ -1961,14 +1963,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index d66b860cbd..f3357b5c59 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELCT 1/0;
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index aa564d153a..d464be2c5b 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1503,7 +1503,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 1dc22e6bdf..0387d4fd09 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3238,26 +3238,61 @@ keep_going: /* We will come back to here until there is
if (conn->sversion >= 70400 &&
conn->requested_session_type != SESSION_TYPE_ANY)
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
+ }
+ else if (conn->transaction_read_only)
+ {
+ /* Not writable; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
}
/* We can release the address list now. */
@@ -3538,6 +3573,7 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 6202653826..d2658efba5 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1059,7 +1059,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1113,6 +1113,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 551120660c..ccaee68be4 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -430,6 +430,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.20.1.windows.1
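
To make the version split in patch 3/6 easier to follow, here is a minimal sketch of how the client could decide where the read-only status comes from once a non-"any" session type has been requested. The names ReadOnlyCheckMethod, choose_read_only_check and reported_value_is_read_only are hypothetical; the thresholds 70400 and 120000 and the comparison against "on" are taken from the patch.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical names; only the version thresholds come from the patch. */
typedef enum
{
    RO_CHECK_NOT_NEEDED,    /* before 7.4: server has no read-only mode */
    RO_CHECK_SEND_SHOW,     /* 7.4 to 11: send SHOW transaction_read_only */
    RO_CHECK_USE_REPORTED   /* 12 and later: value arrives as a reported GUC */
} ReadOnlyCheckMethod;

/* sversion uses libpq's numeric form, e.g. 120000 for version 12. */
ReadOnlyCheckMethod
choose_read_only_check(int sversion)
{
    if (sversion < 70400)
        return RO_CHECK_NOT_NEEDED;
    if (sversion < 120000)
        return RO_CHECK_SEND_SHOW;
    return RO_CHECK_USE_REPORTED;
}

/* Same test the patch applies in pqSaveParameterStatus(). */
bool
reported_value_is_read_only(const char *value)
{
    return value != NULL && strcmp(value, "on") == 0;
}
```

The saving is one query round trip per candidate host on servers that report the GUC.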
0004-New-prefer-read-target_session_attrs-type.patch (application/octet-stream)
From 79fe61d2f9943aceca8ae5c88afe3c7312631b7d Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Sun, 17 Mar 2019 02:19:35 +1100
Subject: [PATCH 4/6] New prefer-read target_session_attrs type
With the prefer-read option, an application prefers to connect
to a read-only server if one is available in the list of hosts;
otherwise it connects to a read-write server.
---
doc/src/sgml/libpq.sgml | 21 ++--
src/interfaces/libpq/fe-connect.c | 143 +++++++++++++++++++++-----
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 13 ++-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
5 files changed, 161 insertions(+), 33 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 7b9d219bde..7918e398c7 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1577,12 +1577,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- The supported options for this parameter are, <literal>any</literal> and
- <literal>read-write</literal>. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- If multiple hosts were specified in the connection string, based on the
- specified value, any remaining servers will be tried before confirming
- succesful connection or failure.
+ The supported options for this parameter are, <literal>any</literal>,
+ <literal>read-write</literal> and <literal>prefer-read</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts were specified in the
+ connection string, based on the specified value, any remaining servers
+ will be tried before confirming successful connection or failure.
</para>
<para>
@@ -1591,6 +1591,13 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
is considered acceptable.
</para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, a
+ connection in which read-only transactions are accepted by default
+ is preferred. If no such connection can be found, then a connection
+ in which read-write transactions are accepted will be considered.
+ </para>
+
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
@@ -1600,7 +1607,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
GUC that is reported by the server upon successful connection.
</para>
</listitem>
- </varlistentry>
+ </varlistentry>
</variablelist>
</para>
</sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 0387d4fd09..8de0d97876 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -322,7 +322,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1247,6 +1247,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_ANY;
else if (strcmp(conn->target_session_attrs, "read-write") == 0)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else
{
conn->status = CONNECTION_BAD;
@@ -2141,13 +2143,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to any read-only
+ * server, so now try the recorded read-write server again.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is already
+ * set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3229,7 +3249,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required, see
+ * if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3257,13 +3278,19 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
+
conn->status = CONNECTION_CHECK_WRITABLE;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
- else if (conn->transaction_read_only)
+ else if ((conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
const char *displayed_host;
const char *displayed_port;
@@ -3276,16 +3303,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (!conn->transaction_read_only &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3293,6 +3332,38 @@ keep_going: /* We will come back to here until there is
conn->try_next_host = true;
goto keep_going;
}
+ else /* obtained the requested type, consume it */
+ {
+ /* We can release the address list now. */
+ release_conn_addrinfo(conn);
+
+ /* We are open for business! */
+ conn->status = CONNECTION_OK;
+ return PGRES_POLLING_OK;
+ }
+ }
+
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2)
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
/* We can release the address list now. */
@@ -3360,11 +3431,25 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested mode is read-write,
+ * ignore it. Server is read-write and requested mode is
+ * prefer-read, record it for the first time and try to
+ * consume in the next scan (it means no read-only server
+ * is found in the first scan).
+ */
+ if ((readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3377,16 +3462,27 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (!readonly_server && conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3395,7 +3491,7 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ /* Session is of the requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3579,6 +3675,7 @@ makeEmptyPGconn(void)
conn->sock = PGINVALID_SOCKET;
conn->requested_session_type = SESSION_TYPE_ANY;
+ conn->read_write_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 6961d7ce8e..4b0fc80df2 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -73,7 +73,8 @@ typedef enum
typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
- SESSION_TYPE_READ_WRITE /* Read-write session */
+ SESSION_TYPE_READ_WRITE, /* Read-write session */
+ SESSION_TYPE_PREFER_READ /* Prefer read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index ccaee68be4..13f971b5a5 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -363,7 +363,10 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: any, read-write,
+ * prefer-read.
+ */
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -400,6 +403,14 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * Index of the first read-write host in the connection string.
+ *
+ * Initial value is -1; it is then set to the index of the first read-write
+ * host, and to -2 during the second connection attempt to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index beb45551a2..0e398136a5 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 26;
+use Test::More tests => 29;
# Initialize master node
my $node_master = get_new_node('master');
@@ -117,6 +117,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to master in "prefer-read" mode with a master-only list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
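
The two-pass host selection that patch 4/6 implements with read_write_host_index can be pictured with a small simulation. This is a sketch under simplifying assumptions: every host is reachable, and each host's read-only status is given up front in an array. The function name choose_host_prefer_read is hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical simulation (not libpq code). server_read_only[i] is true
 * when host i would report transaction_read_only = on. Returns the index
 * of the chosen host, or -1 when the host list is empty.
 */
int
choose_host_prefer_read(const bool *server_read_only, int nhosts)
{
    int read_write_host_index = -1;     /* -1: no read-write host seen yet */
    int whichhost;

    /* First pass: take the first read-only host, remember the first
     * read-write host in case no read-only host exists. */
    for (whichhost = 0; whichhost < nhosts; whichhost++)
    {
        if (server_read_only[whichhost])
            return whichhost;
        if (read_write_host_index == -1)
            read_write_host_index = whichhost;
    }

    /* Second pass: fall back to the recorded read-write host, if any. */
    return read_write_host_index;
}
```

In the patch itself the second pass re-enters the connection state machine, which is why it sets read_write_host_index to -2 as a marker to accept the read-write host and avoid looping; the simulation simply returns instead.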
0006-Primary-prefer-standby-and-standby-options.patch (application/octet-stream)
From 30af430274381220895382d2602537f7ae84ce40 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Sun, 17 Mar 2019 02:11:37 +1100
Subject: [PATCH 6/6] Primary, prefer-standby and standby options
New options that check whether a server is in recovery mode
before considering it for connection. To determine whether the
server is running in recovery mode, libpq sends the query
'SELECT pg_is_in_recovery()' to the server.
---
src/interfaces/libpq/fe-connect.c | 193 ++++++++++++++++++++++++++++--
src/interfaces/libpq/libpq-fe.h | 8 +-
src/interfaces/libpq/libpq-int.h | 2 +-
3 files changed, 190 insertions(+), 13 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 2e16841703..16a1dd35b3 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -124,6 +124,7 @@ static int ldapServiceLookup(const char *purl, PQconninfoOption *options,
#define DefaultOption ""
#define DefaultAuthtype ""
#define DefaultTargetSessionAttrs "any"
+#define DefaultTargetServerType "any"
#ifdef USE_SSL
#define DefaultSSLMode "prefer"
#else
@@ -322,7 +323,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1251,6 +1252,12 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else if (strcmp(conn->target_session_attrs, "read-only") == 0)
conn->requested_session_type = SESSION_TYPE_READ_ONLY;
+ else if (strcmp(conn->target_session_attrs, "primary") == 0)
+ conn->requested_session_type = SESSION_TYPE_PRIMARY;
+ else if (strcmp(conn->target_session_attrs, "prefer-standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_STANDBY;
+ else if (strcmp(conn->target_session_attrs, "standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_STANDBY;
else
{
conn->status = CONNECTION_BAD;
@@ -2106,6 +2113,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_NEEDED:
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -3259,7 +3267,9 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->requested_session_type != SESSION_TYPE_ANY)
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY))
{
if (conn->sversion < 120000)
{
@@ -3346,21 +3356,52 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_OK;
}
}
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_session_type == SESSION_TYPE_PRIMARY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
/*
- * Requested type is prefer-read, then record this host index
- * and try the other before considering it later
+ * Requested type is prefer-read or prefer-standby, then record
+ * this host index and try others before considering it later
*/
- if ((conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ if (((conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY) &&
conn->read_write_host_index != -2) ||
- conn->requested_session_type == SESSION_TYPE_READ_ONLY)
+ (conn->requested_session_type == SESSION_TYPE_READ_ONLY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY))
{
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ if ((conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY) &&
conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
@@ -3450,11 +3491,14 @@ keep_going: /* We will come back to here until there is
* is found in the first scan).
*/
if ((readonly_server &&
- conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PRIMARY)) ||
(!readonly_server &&
- ((conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ (((conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY) &&
conn->read_write_host_index != -2) ||
- conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
+ (conn->requested_session_type == SESSION_TYPE_READ_ONLY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY))))
{
/* Not a requested type; fail this connection. */
PQclear(res);
@@ -3542,6 +3586,135 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in recovery mode and requested mode is primary,
+ * ignore it. Server is not in recovery mode and requested mode is
+ * prefer-standby, record it for the first time and try to
+ * consume in the next scan (it means no standby server
+ * is found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (((conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY) &&
+ conn->read_write_host_index != -2) ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (!standby_server &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 5d0b885dae..4c1b849019 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -67,7 +67,8 @@ typedef enum
* connection. */
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
@@ -75,7 +76,10 @@ typedef enum
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
SESSION_TYPE_PREFER_READ, /* Prefer read only session */
- SESSION_TYPE_READ_ONLY /* Read only session */
+ SESSION_TYPE_READ_ONLY, /* Read only session */
+ SESSION_TYPE_PRIMARY, /* Primary server */
+ SESSION_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SESSION_TYPE_STANDBY /* Standby server */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 62a1b7bbd4..f4e9c1f64b 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,7 +365,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read and read-only.
+ * prefer-read, read-only, primary, prefer-standby and standby.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
--
2.20.1.windows.1
On Fri, Mar 22, 2019 at 6:07 PM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
On Fri, Mar 22, 2019 at 7:32 AM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
On Fri, Mar 22, 2019 at 6:57 AM Robert Haas <robertmhaas@gmail.com>
wrote:
On Thu, Mar 21, 2019 at 2:26 AM Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
Based on the above new options that can be added to
target_session_attrs,
primary - it is just an alias to the read-write option.
standby, prefer-standby - These options should check whether server is
running in recovery mode or not
instead of checking whether server accepts read-only connections or
not?
I think it will be best to have one set of attributes that check
default_transaction_read_only and a differently-named set that check
pg_is_in_recovery(). For each, there should be one value that looks
for a 'true' return and one value that looks for a 'false' return and
perhaps values that accept either but prefer one or the other.
IOW, there's no reason to make primary an alias for read-write. If
you want read-write, you can just say read-write. But we can make
'primary' or 'master' look for a server that's not in recovery and
'standby' look for one that is.
OK, I agree with opinion. I will produce a patch for those new options.
Here I attached a WIP patch for the new options along with the other older
patches.
The basic cases are working fine; docs and tests are missing.
Currently this patch doesn't implement the GUC_REPORT for recovery mode
yet. I am yet to optimize the complex if check.
Except for the in_hotstandby GUC_REPORT, the rest of the changes are
implemented. Updated patches are attached.
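To make the semantics of the two option families discussed above easier to review, here is a minimal standalone sketch. It is not code from the patches: the names TargetType, parse_target and host_acceptable are hypothetical, and the "second pass" flag is a simplification of what the patches track through read_write_host_index. One family looks only at transaction_read_only, the other only at pg_is_in_recovery().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum for this sketch; it mirrors the option names, not the patch's identifiers. */
typedef enum
{
    TARGET_ANY,
    TARGET_READ_WRITE,
    TARGET_PREFER_READ,
    TARGET_READ_ONLY,
    TARGET_PRIMARY,
    TARGET_PREFER_STANDBY,
    TARGET_STANDBY,
    TARGET_INVALID
} TargetType;

/* Map a target_session_attrs string to the enum; unknown strings are rejected. */
static TargetType
parse_target(const char *value)
{
    /* Order must match the enum above. */
    static const char *const names[] = {
        "any", "read-write", "prefer-read", "read-only",
        "primary", "prefer-standby", "standby"
    };

    for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++)
    {
        if (strcmp(value, names[i]) == 0)
            return (TargetType) i;
    }
    return TARGET_INVALID;
}

/*
 * Decide whether a host is acceptable for the requested type.
 *
 * read_only   - what transaction_read_only reports for the session
 * in_recovery - what pg_is_in_recovery() returns
 * second_pass - true once a full scan of the host list found no preferred
 *               host, so the "prefer-" modes fall back to any host
 */
static bool
host_acceptable(TargetType type, bool read_only, bool in_recovery,
                bool second_pass)
{
    switch (type)
    {
        case TARGET_ANY:
            return true;
        case TARGET_READ_WRITE:
            return !read_only;
        case TARGET_READ_ONLY:
            return read_only;
        case TARGET_PREFER_READ:
            return read_only || second_pass;
        case TARGET_PRIMARY:
            return !in_recovery;
        case TARGET_STANDBY:
            return in_recovery;
        case TARGET_PREFER_STANDBY:
            return in_recovery || second_pass;
        default:
            return false;
    }
}
```

With this split, a primary that has default_transaction_read_only turned on is accepted by "primary" but rejected by "read-write", which is the distinction Robert asked for.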
While going through the old patch where the GUC_REPORT is implemented, I
noticed that Tom had commented on the logic of sending SIGHUP to all
backends to process the hot standby exit. If we add the logic of updating
the GUC variable value to the SIGHUP handling, we may need to change all
the SIGHUP handler code paths. It is also possible that there is no need
to update the variable for processes other than backends.
Adding a new SIGUSR1 type to check and update the GUC variable instead
could work for most of the backends and background workers.
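As a rough illustration of the SIGUSR1 idea, the following standalone sketch shows the usual pattern: the handler only sets a flag, and the process refreshes its reported value later at a safe point. This is plain C, not PostgreSQL code; all names are hypothetical, and the real server multiplexes SIGUSR1 through its procsignal mechanism rather than installing a handler like this.

```c
#include <signal.h>
#include <stdbool.h>

/* Set by the signal handler; sig_atomic_t is the only type safe to write there. */
static volatile sig_atomic_t recovery_state_changed = 0;

/* Hypothetical per-process copy of the value last reported to the client. */
static bool reported_in_recovery = true;

/* The handler does the minimum: remember that something may have changed. */
static void
handle_sigusr1(int signo)
{
    (void) signo;
    recovery_state_changed = 1;
}

/* Install the handler; returns false if that fails. */
static bool
install_handler(void)
{
    return signal(SIGUSR1, handle_sigusr1) != SIG_ERR;
}

/*
 * Called from the main loop at a safe point. current_in_recovery stands in
 * for whatever the process would consult to learn the real state.
 * Returns true when the reported value changed, i.e. when the client
 * would need to be sent an update.
 */
static bool
process_pending_update(bool current_in_recovery)
{
    if (!recovery_state_changed)
        return false;

    recovery_state_changed = 0;
    if (reported_in_recovery == current_in_recovery)
        return false;

    reported_in_recovery = current_in_recovery;
    return true;
}
```

Only processes that install the handler and call process_pending_update() are affected, so the existing SIGHUP code paths would stay untouched under this assumption.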
Opinions?
Regards,
Haribabu Kommi
Fujitsu Australia
Note: the attachment order may sometimes be wrong.
Attachments:
0001-Restructure-the-code-to-remove-duplicate-code.patch (application/octet-stream)
From 069f073d7d1e33c8a35689eef02666f52be391af Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 21 Feb 2019 23:11:55 +1100
Subject: [PATCH 1/7] Restructure the code to remove duplicate code
The duplicated logic that checks the server version before issuing
SHOW transaction_read_only, to find out whether the server is
read-write or not, is restructured under a new connection status
(CONNECTION_CHECK_TARGET), so that the duplicate code is removed.
This is required for the next set of patches.
---
doc/src/sgml/libpq.sgml | 26 +++++---
src/interfaces/libpq/fe-connect.c | 99 ++++++++++++-------------------
src/interfaces/libpq/libpq-fe.h | 3 +-
3 files changed, 57 insertions(+), 71 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c1d1b6b2db..ac78524924 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1576,17 +1576,27 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<varlistentry id="libpq-connect-target-session-attrs" xreflabel="target_session_attrs">
<term><literal>target_session_attrs</literal></term>
<listitem>
+ <para>
+ The supported options for this parameter are, <literal>any</literal> and
+ <literal>read-write</literal>. The default value of this parameter,
+ <literal>any</literal>, regards all connections as acceptable.
+ If multiple hosts were specified in the connection string, based on the
+ specified value, any remaining servers will be tried before confirming
+ succesful connection or failure.
+ </para>
+
<para>
If this parameter is set to <literal>read-write</literal>, only a
connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server supports read-write transactions are not,
+ query <literal>SHOW transaction_read_only</literal> will be sent upon any
+ successful connection; if it returns <literal>on</literal>, means server
+ doesn't support read-write transactions.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index e3bf6a7449..c68448786d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3188,6 +3188,43 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_WRITING;
}
+ conn->status = CONNECTION_CHECK_TARGET;
+ goto keep_going;
+ }
+
+ case CONNECTION_SETENV:
+ {
+
+ /*
+ * Do post-connection housekeeping (only needed in protocol 2.0).
+ *
+ * We pretend that the connection is OK for the duration of these
+ * queries.
+ */
+ conn->status = CONNECTION_OK;
+
+ switch (pqSetenvPoll(conn))
+ {
+ case PGRES_POLLING_OK: /* Success */
+ break;
+
+ case PGRES_POLLING_READING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_READING;
+
+ case PGRES_POLLING_WRITING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_WRITING;
+
+ default:
+ goto error_return;
+ }
+ }
+
+ /* Intentional fall through */
+
+ case CONNECTION_CHECK_TARGET:
+ {
/*
* If a read-write connection is required, see if we have one.
*
@@ -3229,68 +3266,6 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_OK;
}
- case CONNECTION_SETENV:
-
- /*
- * Do post-connection housekeeping (only needed in protocol 2.0).
- *
- * We pretend that the connection is OK for the duration of these
- * queries.
- */
- conn->status = CONNECTION_OK;
-
- switch (pqSetenvPoll(conn))
- {
- case PGRES_POLLING_OK: /* Success */
- break;
-
- case PGRES_POLLING_READING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_READING;
-
- case PGRES_POLLING_WRITING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_WRITING;
-
- default:
- goto error_return;
- }
-
- /*
- * If a read-write connection is required, see if we have one.
- * (This should match the stanza in the CONNECTION_AUTH_OK case
- * above.)
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but by
- * the same token they don't have any read-only mode, so we may
- * just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
- {
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
- {
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
- }
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
- }
-
- /* We can release the address list now. */
- release_conn_addrinfo(conn);
-
- /* We are open for business! */
- conn->status = CONNECTION_OK;
- return PGRES_POLLING_OK;
-
case CONNECTION_CONSUME:
{
conn->status = CONNECTION_OK;
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 97bc98b1f3..50cfe266c6 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -65,8 +65,9 @@ typedef enum
CONNECTION_NEEDED, /* Internal state: connect() needed */
CONNECTION_CHECK_WRITABLE, /* Check if we could make a writable
* connection. */
- CONNECTION_CONSUME /* Wait for any pending message and consume
+ CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
+ CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
typedef enum
--
2.20.1.windows.1
0002-New-TargetSessionAttrsType-enum.patch (application/octet-stream)
From 695afa2f9404515a9dcb829782157fdcf7ca408d Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 11:50:33 +1100
Subject: [PATCH 2/7] New TargetSessionAttrsType enum
This new enum makes it possible to compare the requested session type
as an enum value instead of always comparing strings. This may not show
much improvement with the current code, but it will be useful for
further patches.
---
src/interfaces/libpq/fe-connect.c | 12 ++++++++----
src/interfaces/libpq/libpq-fe.h | 6 ++++++
src/interfaces/libpq/libpq-int.h | 1 +
3 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index c68448786d..1dc22e6bdf 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1243,8 +1243,11 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ else if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -3233,8 +3236,7 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type != SESSION_TYPE_ANY)
{
/*
* Save existing error messages across the PQsendQuery
@@ -3540,6 +3542,8 @@ makeEmptyPGconn(void)
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
+ conn->requested_session_type = SESSION_TYPE_ANY;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 50cfe266c6..6961d7ce8e 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -70,6 +70,12 @@ typedef enum
CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
+typedef enum
+{
+ SESSION_TYPE_ANY = 0, /* Any session (default) */
+ SESSION_TYPE_READ_WRITE /* Read-write session */
+} TargetSessionAttrsType;
+
typedef enum
{
PGRES_POLLING_FAILED = 0,
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index dbe0f7e5c0..551120660c 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,6 +365,7 @@ struct pg_conn
/* Type of connection to make. Possible values: any, read-write. */
char *target_session_attrs;
+ TargetSessionAttrsType requested_session_type;
/* Optional file to write trace info to */
FILE *Pfdebug;
--
2.20.1.windows.1
0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch (application/octet-stream)
From c700c3a7bc4dd033b8a7a8ff2d00e108c80b11a0 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 16:12:24 +1100
Subject: [PATCH 3/7] Make transaction_read_only as GUC_REPORT varaible
The value of the transaction_read_only GUC is used in multi-host
connections to identify a read-write host, but currently this is
done by executing a command (SHOW transaction_read_only) after
connecting. Reporting the GUC to the client upon connection instead
avoids that extra query and reduces the time needed to make the connection.
---
doc/src/sgml/libpq.sgml | 14 ++++---
doc/src/sgml/protocol.sgml | 8 ++--
src/backend/utils/misc/guc.c | 2 +-
src/interfaces/libpq/fe-connect.c | 70 +++++++++++++++++++++++--------
src/interfaces/libpq/fe-exec.c | 6 ++-
src/interfaces/libpq/libpq-int.h | 1 +
6 files changed, 74 insertions(+), 27 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index ac78524924..7b9d219bde 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1594,8 +1594,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, means server
- doesn't support read-write transactions.
+ successful connection if the server is prior to version 12; if it returns
+ <literal>on</literal>, means server doesn't support read-write transactions.
+ But for servers version 12 or greater uses the <varname>transaction_read_only</varname>
+ GUC that is reported by the server upon successful connection.
</para>
</listitem>
</varlistentry>
@@ -1961,14 +1963,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by release before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index d66b860cbd..f3357b5c59 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELCT 1/0;
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index aa564d153a..d464be2c5b 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1503,7 +1503,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 1dc22e6bdf..0387d4fd09 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3238,26 +3238,61 @@ keep_going: /* We will come back to here until there is
if (conn->sversion >= 70400 &&
conn->requested_session_type != SESSION_TYPE_ANY)
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
+ }
+ else if (conn->transaction_read_only)
+ {
+ /* Not writable; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
}
/* We can release the address list now. */
@@ -3538,6 +3573,7 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 6202653826..d2658efba5 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1059,7 +1059,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1113,6 +1113,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 551120660c..ccaee68be4 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -430,6 +430,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* session_read_only */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.20.1.windows.1
0005-New-read-only-target_session_attrs-type.patch (application/octet-stream)
From 2e5a27473b6c33a0ead839cede2c7e8823d3e047 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Mon, 25 Mar 2019 17:35:07 +1100
Subject: [PATCH 5/7] New read-only target_session_attrs type
With this read-only option type, an application can connect to a
read-only server in the list of hosts. If no read-only server is
available, the connection attempt fails.
---
doc/src/sgml/libpq.sgml | 7 +++++-
src/interfaces/libpq/fe-connect.c | 34 ++++++++++++++++++++-------
src/interfaces/libpq/libpq-fe.h | 3 ++-
src/interfaces/libpq/libpq-int.h | 2 +-
src/test/recovery/t/001_stream_rep.pl | 10 +++++++-
5 files changed, 43 insertions(+), 13 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 7918e398c7..79d122d756 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1578,7 +1578,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal> and <literal>prefer-read</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1606,6 +1606,11 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
But for servers version 12 or greater uses the <varname>transaction_read_only</varname>
GUC that is reported by the server upon successful connection.
</para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection
+ in which read-only transactions are accepted by default.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 4359a3e152..668bb9712d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1249,6 +1249,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
+ else if (strcmp(conn->target_session_attrs, "read-only") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_ONLY;
else
{
conn->status = CONNECTION_BAD;
@@ -3249,8 +3251,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write or prefer-read connection is required, see
- * if we have one.
+ * If a read-write, prefer-read or read-only connection is
+ * required, see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3287,7 +3289,8 @@ keep_going: /* We will come back to here until there is
else if ((conn->transaction_read_only &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!conn->transaction_read_only &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/* Not a requested type; fail this connection. */
const char *displayed_host;
@@ -3347,17 +3350,28 @@ keep_going: /* We will come back to here until there is
/*
* Requested type is prefer-read, then record this host index
- * and try the other before considering it later
+ * and try the other before considering it later. If requested
+ * type of connection is read-only, ignore this connection.
*/
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)
{
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that are
+ * default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto target_accept_connection;
+
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->read_write_host_index == -1)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
/*
@@ -3445,12 +3459,14 @@ keep_going: /* We will come back to here until there is
* ignore it. Server is read-write and requested mode is
* prefer-read, record it for the first time and try to
* consume in the next scan (it means no read-only server
- * is found in the first scan).
+ * is found in the first scan). Server is read-write and
+ * requested mode is read-only, ignore this connection.
*/
if ((readonly_server &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!readonly_server &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/*
* The following scenario is possible only for the
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 4b0fc80df2..5d0b885dae 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -74,7 +74,8 @@ typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
- SESSION_TYPE_PREFER_READ /* Prefer read only session */
+ SESSION_TYPE_PREFER_READ, /* Prefer read only session */
+ SESSION_TYPE_READ_ONLY /* Read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 13f971b5a5..62a1b7bbd4 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,7 +365,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read.
+ * prefer-read and read-only.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 0e398136a5..651107f49b 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 29;
+use Test::More tests => 31;
# Initialize master node
my $node_master = get_new_node('master');
@@ -129,6 +129,14 @@ test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
test_target_session_attrs($node_master, $node_master, $node_master,
"prefer-read", 0);
+# Connect to standby1 in "read-only" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "read-only", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
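
The first-pass acceptance rule that the read-only patch extends can be read in isolation. The following is a minimal sketch, not code from the patch: the enum values mirror the ones the series adds to libpq-fe.h, while the helper names session_type_mismatch and records_fallback are hypothetical and exist only for illustration. It ignores the second pass that prefer-read makes over the host list.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the enum as it stands after the read-only patch. */
typedef enum
{
	SESSION_TYPE_ANY = 0,		/* Any session (default) */
	SESSION_TYPE_READ_WRITE,	/* Read-write session */
	SESSION_TYPE_PREFER_READ,	/* Prefer read only session */
	SESSION_TYPE_READ_ONLY		/* Read only session */
} TargetSessionAttrsType;

/*
 * Hypothetical helper: returns true when a server must be skipped on the
 * first pass over the host list, given whether it reports
 * transaction_read_only = on.
 */
static bool
session_type_mismatch(bool transaction_read_only,
					  TargetSessionAttrsType requested)
{
	assert(requested <= SESSION_TYPE_READ_ONLY);

	return (transaction_read_only &&
			requested == SESSION_TYPE_READ_WRITE) ||
		(!transaction_read_only &&
		 (requested == SESSION_TYPE_PREFER_READ ||
		  requested == SESSION_TYPE_READ_ONLY));
}

/*
 * Hypothetical helper: only prefer-read remembers a skipped read-write
 * host as a fallback; read-only simply ignores it.
 */
static bool
records_fallback(TargetSessionAttrsType requested)
{
	return requested == SESSION_TYPE_PREFER_READ;
}
```

The difference between the two modes is confined to the fallback: both skip a read-write server on the first pass, but only prefer-read may come back to it.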
0004-New-prefer-read-target_session_attrs-type.patch (application/octet-stream)
From eb62f768d0f89c6f7534b1432b86362f56f573ea Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Sun, 17 Mar 2019 02:19:35 +1100
Subject: [PATCH 4/7] New prefer-read target_session_attrs type
With this prefer-read option type, an application can prefer
connecting to a read-only server if one is available in the list
of hosts; otherwise it connects to a read-write server.
---
doc/src/sgml/libpq.sgml | 21 ++--
src/interfaces/libpq/fe-connect.c | 161 ++++++++++++++++++++++----
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 13 ++-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
5 files changed, 177 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 7b9d219bde..7918e398c7 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1577,12 +1577,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- The supported options for this parameter are, <literal>any</literal> and
- <literal>read-write</literal>. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- If multiple hosts were specified in the connection string, based on the
- specified value, any remaining servers will be tried before confirming
- succesful connection or failure.
+ The supported options for this parameter are, <literal>any</literal>,
+ <literal>read-write</literal> and <literal>prefer-read</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts were specified in the
+ connection string, based on the specified value, any remaining servers
+ will be tried before confirming successful connection or failure.
</para>
<para>
@@ -1591,6 +1591,13 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
is considered acceptable.
</para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, a
+ connection in which read-only transactions are accepted by default
+ is preferred. If no such connection can be found, then a connection
+ in which read-write transactions are accepted will be considered.
+ </para>
+
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
@@ -1600,7 +1607,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
GUC that is reported by the server upon successful connection.
</para>
</listitem>
- </varlistentry>
+ </varlistentry>
</variablelist>
</para>
</sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 0387d4fd09..4359a3e152 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -322,7 +322,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1247,6 +1247,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_ANY;
else if (strcmp(conn->target_session_attrs, "read-write") == 0)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else
{
conn->status = CONNECTION_BAD;
@@ -2141,13 +2143,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means, failed to connect to read-only servers
+ * and now try connect to read-write server again.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3229,7 +3249,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required, see
+ * if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3244,8 +3265,8 @@ keep_going: /* We will come back to here until there is
* Save existing error messages across the PQsendQuery
* attempt. This is necessary because PQsendQuery is
* going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
*/
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
@@ -3257,16 +3278,30 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
+
conn->status = CONNECTION_CHECK_WRITABLE;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
- else if (conn->transaction_read_only)
+ else if ((conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
const char *displayed_host;
const char *displayed_port;
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_target_connection;
+
/* Append error report to conn->errorMessage. */
if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
displayed_host = conn->connhost[conn->whichhost].hostaddr;
@@ -3276,16 +3311,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3293,8 +3340,36 @@ keep_going: /* We will come back to here until there is
conn->try_next_host = true;
goto keep_going;
}
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2)
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3360,11 +3435,33 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested mode is read-write,
+ * ignore it. Server is read-write and requested mode is
+ * prefer-read, record it for the first time and try to
+ * consume in the next scan (it means no read-only server
+ * is found in the first scan).
+ */
+ if ((readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_write_connection;
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3377,16 +3474,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3395,7 +3504,8 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3579,6 +3689,7 @@ makeEmptyPGconn(void)
conn->sock = PGINVALID_SOCKET;
conn->requested_session_type = SESSION_TYPE_ANY;
+ conn->read_write_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 6961d7ce8e..4b0fc80df2 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -73,7 +73,8 @@ typedef enum
typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
- SESSION_TYPE_READ_WRITE /* Read-write session */
+ SESSION_TYPE_READ_WRITE, /* Read-write session */
+ SESSION_TYPE_PREFER_READ /* Prefer read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index ccaee68be4..13f971b5a5 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -363,7 +363,10 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: any, read-write,
+ * prefer-read.
+ */
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -400,6 +403,14 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * First read-write host index in the connection string.
+ *
+ * Initial value is -1, then the index of the first read-write host, -2
+ * during the second attempt of connection to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index beb45551a2..0e398136a5 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 26;
+use Test::More tests => 29;
# Initialize master node
my $node_master = get_new_node('master');
@@ -117,6 +117,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
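
The two-pass host scan that the prefer-read patch builds around read_write_host_index (-1 initially, then the index of the first read-write host, then -2 during the second attempt) can be simulated without a server. The sketch below is an illustration under stated assumptions, not the patch's code: pick_host is a hypothetical function, and an array of booleans stands in for the result of checking transaction_read_only on each host.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the enum as it stands after the prefer-read patch. */
typedef enum
{
	SESSION_TYPE_ANY = 0,
	SESSION_TYPE_READ_WRITE,
	SESSION_TYPE_PREFER_READ
} TargetSessionAttrsType;

/*
 * Hypothetical stand-in for the connection loop. host_is_read_only[i] is
 * true when host i reports transaction_read_only = on. Returns the index of
 * the host that would be used, or -1 if no host is acceptable.
 */
static int
pick_host(const bool *host_is_read_only, int nhosts,
		  TargetSessionAttrsType requested)
{
	int			read_write_host_index = -1; /* -1: none yet, -2: second pass */
	int			whichhost = 0;

	assert(nhosts > 0);

	for (;;)
	{
		bool		read_only = host_is_read_only[whichhost];
		bool		mismatch =
			(read_only && requested == SESSION_TYPE_READ_WRITE) ||
			(!read_only && requested == SESSION_TYPE_PREFER_READ);

		/* On the second pass the recorded host is taken unconditionally. */
		if (!mismatch || read_write_host_index == -2)
			return whichhost;

		/* Remember the first read-write host as the prefer-read fallback. */
		if (requested == SESSION_TYPE_PREFER_READ &&
			read_write_host_index == -1)
			read_write_host_index = whichhost;

		if (whichhost + 1 >= nhosts)
		{
			if (read_write_host_index < 0)
				return -1;		/* no more hosts and no fallback */

			/* Go back to the fallback and mark the second pass. */
			whichhost = read_write_host_index;
			read_write_host_index = -2;
		}
		else
			whichhost++;
	}
}
```

With a "master,standby1" list (read-write first, read-only second) and prefer-read, the sketch settles on index 1; with a list containing only the master it falls back to index 0 on the second pass, which matches the three cases added to 001_stream_rep.pl.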
0006-Primary-prefer-standby-and-standby-options.patch (application/octet-stream)
From cb6e88980096c2090d3d236837c331a6405a2aa3 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Mon, 25 Mar 2019 17:47:09 +1100
Subject: [PATCH 6/7] Primary, prefer-standby and standby options
New options to check whether a server is in recovery mode
before considering it for connection. To determine whether
the server is running in recovery mode, libpq sends the query
'SELECT pg_is_in_recovery()' to the server.
---
doc/src/sgml/libpq.sgml | 26 ++-
src/interfaces/libpq/fe-connect.c | 236 +++++++++++++++++++++++---
src/interfaces/libpq/libpq-fe.h | 8 +-
src/interfaces/libpq/libpq-int.h | 4 +-
src/test/recovery/t/001_stream_rep.pl | 18 +-
5 files changed, 262 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 79d122d756..42dd93bd69 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1578,7 +1578,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>prefer-standby</literal> and <literal>standby</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1611,6 +1612,29 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
If this parameter is set to <literal>read-only</literal>, only a connection
in which read-only transactions are accepted by default.
</para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, only a connection in which
+ the server is not in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection in which
+ the server is in recovery mode is preferred. If no such connection can be found,
+ then a connection in which the server is not in recovery mode will be considered.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server is in recovery mode, the query <literal>SELECT pg_is_in_recovery()</literal>
+ will be sent upon any successful connection; if it returns <literal>t</literal>, the server
+ is in recovery mode.
+ </para>
+
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 668bb9712d..7a75b692c8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -124,6 +124,7 @@ static int ldapServiceLookup(const char *purl, PQconninfoOption *options,
#define DefaultOption ""
#define DefaultAuthtype ""
#define DefaultTargetSessionAttrs "any"
+#define DefaultTargetServerType "any"
#ifdef USE_SSL
#define DefaultSSLMode "prefer"
#else
@@ -322,7 +323,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1251,6 +1252,12 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else if (strcmp(conn->target_session_attrs, "read-only") == 0)
conn->requested_session_type = SESSION_TYPE_READ_ONLY;
+ else if (strcmp(conn->target_session_attrs, "primary") == 0)
+ conn->requested_session_type = SESSION_TYPE_PRIMARY;
+ else if (strcmp(conn->target_session_attrs, "prefer-standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_STANDBY;
+ else if (strcmp(conn->target_session_attrs, "standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_STANDBY;
else
{
conn->status = CONNECTION_BAD;
@@ -2106,6 +2113,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_NEEDED:
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -2145,19 +2153,19 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- if (conn->read_write_host_index >= 0)
+ if (conn->read_write_or_primary_host_index >= 0)
{
/*
* Getting here means, failed to connect to read-only servers
* and now try connect to read-write server again.
*/
- conn->whichhost = conn->read_write_host_index;
+ conn->whichhost = conn->read_write_or_primary_host_index;
/*
* Reset the host index value to avoid recursion during the
* second connection attempt.
*/
- conn->read_write_host_index = -2;
+ conn->read_write_or_primary_host_index = -2;
}
else
{
@@ -3259,7 +3267,9 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->requested_session_type != SESSION_TYPE_ANY)
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY))
{
if (conn->sversion < 120000)
{
@@ -3302,7 +3312,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
/* Append error report to conn->errorMessage. */
@@ -3333,8 +3343,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3349,30 +3359,70 @@ keep_going: /* We will come back to here until there is
}
/*
- * Requested type is prefer-read, then record this host index
- * and try the other before considering it later. If requested
- * type of connection is read-only, ignore this connection.
+ * Servers before 9.0 don't support recovery, so skip the check
+ * when the requested type of connection is primary,
+ * prefer-standby or standby.
*/
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_session_type == SESSION_TYPE_PRIMARY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ /*
+ * Requested type is prefer-read or prefer-standby, then
+ * record this host index and try the other before considering
+ * it later. If requested type of connection is read-only or
+ * standby, ignore this connection.
+ */
+
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
- conn->requested_session_type == SESSION_TYPE_READ_ONLY)
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)
{
/*
* The following scenario is possible only for the
- * prefer-read mode for the next pass of the list of
- * connections as it couldn't find any servers that are
- * default read-only.
+ * prefer-read or prefer-standby mode for the next pass of
+ * the list of connections as it couldn't find any servers
+ * that are default read-only or in recovery mode.
*/
- if (conn->read_write_host_index == -2)
- goto target_accept_connection;
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY)
+ {
+ if (conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+ }
/*
* Try next host if any, but we don't want to consider
@@ -3474,7 +3524,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_write_connection;
/* Not a requested type; fail this connection. */
@@ -3509,8 +3559,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3563,6 +3613,144 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in recovery mode and requested mode is
+ * primary, ignore it. Server is not in recovery mode and
+ * requested mode is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server is found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_recovery_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
+ consume_checked_recovery_connection:
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3705,7 +3893,7 @@ makeEmptyPGconn(void)
conn->sock = PGINVALID_SOCKET;
conn->requested_session_type = SESSION_TYPE_ANY;
- conn->read_write_host_index = -1;
+ conn->read_write_or_primary_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 5d0b885dae..4c1b849019 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -67,7 +67,8 @@ typedef enum
* connection. */
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
@@ -75,7 +76,10 @@ typedef enum
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
SESSION_TYPE_PREFER_READ, /* Prefer read only session */
- SESSION_TYPE_READ_ONLY /* Read only session */
+ SESSION_TYPE_READ_ONLY, /* Read only session */
+ SESSION_TYPE_PRIMARY, /* Primary server */
+ SESSION_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SESSION_TYPE_STANDBY /* Standby server */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 62a1b7bbd4..1591acdc8f 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,7 +365,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read and read-only.
+ * prefer-read, read-only, primary, prefer-standby and standby.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -409,7 +409,7 @@ struct pg_conn
* Initial value is -1, then the index of the first read-write host, -2
* during the second attempt of connection to avoid recursion.
*/
- int read_write_host_index;
+ int read_write_or_primary_host_index;
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 651107f49b..f18fef4445 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 31;
+use Test::More tests => 35;
# Initialize master node
my $node_master = get_new_node('master');
@@ -137,6 +137,22 @@ test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"read-only", 0);
+# Connect to master in "primary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
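The prefer-standby handling in the patch above can be read as a two-pass scan over the host list. The following is only a minimal sketch of that idea, not the libpq code itself: the function name choose_prefer_standby and the in_recovery array are hypothetical, and the real patch records the second pass with the value -2 in read_write_or_primary_host_index rather than returning an index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical simulation of the "prefer-standby" host scan.
 * in_recovery[i] says whether host i would answer true to
 * "SELECT pg_is_in_recovery()". Returns the index of the chosen
 * host, or -1 if the host list is empty.
 */
static int
choose_prefer_standby(const bool *in_recovery, size_t nhosts)
{
    int     primary_index = -1; /* -1: no primary remembered yet */
    size_t  i;

    assert(in_recovery != NULL || nhosts == 0);

    /* First pass: accept the first standby, remember the first primary. */
    for (i = 0; i < nhosts; i++)
    {
        if (in_recovery[i])
            return (int) i;
        if (primary_index == -1)
            primary_index = (int) i;
    }

    /* Second pass: no standby was found, fall back to the remembered primary. */
    return primary_index;
}
```

With a master,standby1 list this picks standby1, and with a master,master list it falls back to the first master, which matches what the added regression tests expect.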
0007-New-function-to-rejecting-the-checked-write-connecti.patch
From 185253d4379cc3985634f9344b06c7c8b6f0396b Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Mon, 25 Mar 2019 18:11:18 +1100
Subject: [PATCH 7/7] New function to rejecting the checked write connection
When a connection has been checked for writability and, based
on the result, we decide to reject it, call the newly
added function to reject it.
---
src/interfaces/libpq/fe-connect.c | 123 ++++++++++++------------------
1 file changed, 47 insertions(+), 76 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 7a75b692c8..82b80385d1 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2032,6 +2032,51 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+static void
+reject_checked_write_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3302,10 +3347,6 @@ keep_going: /* We will come back to here until there is
(conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
- /* Not a requested type; fail this connection. */
- const char *displayed_host;
- const char *displayed_port;
-
/*
* The following scenario is possible only for the
* prefer-read mode for the next pass of the list of
@@ -3315,42 +3356,7 @@ keep_going: /* We will come back to here until there is
if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
@@ -3531,42 +3537,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
--
2.20.1.windows.1
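The helper factored out in the patch above mostly picks a host and port to display and appends a rejection message. A minimal standalone sketch of that step follows. It is an illustration only: format_rejection and SKETCH_DEFAULT_PORT are hypothetical names, the default port "5432" is an assumption standing in for DEF_PGPORT_STR, and the sketch writes into a plain buffer instead of conn->errorMessage.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-in for DEF_PGPORT_STR; the real value is set at build time. */
#define SKETCH_DEFAULT_PORT "5432"

/*
 * Hypothetical helper: format the rejection message for one host.
 * An empty or missing port falls back to the default port string,
 * as the patch does for displayed_port.
 */
static int
format_rejection(char *buf, size_t buflen,
                 const char *host, const char *port, int wanted_writable)
{
    assert(buf != NULL && host != NULL);

    if (port == NULL || port[0] == '\0')
        port = SKETCH_DEFAULT_PORT;

    if (wanted_writable)
        return snprintf(buf, buflen,
                        "could not make a writable connection to server \"%s:%s\"\n",
                        host, port);
    return snprintf(buf, buflen,
                    "could not make a readonly connection to server \"%s:%s\"\n",
                    host, port);
}
```

The two message strings are kept whole rather than assembled from fragments, in the same way the patch keeps two separate libpq_gettext() calls so that each message can be translated as a unit.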
From: Haribabu Kommi [mailto:kommi.haribabu@gmail.com]
While going through the old patch where GUC_REPORT is implemented, I saw that
Tom commented on the logic of sending a signal to all backends to process
the hot standby exit with SIGHUP. If we add the logic of updating the GUC
variable value in SIGHUP, we may need to change all the SIGHUP handler code
paths. It is also possible that there is no need to update the variable
for processes other than backends.
If we go with adding a new SIGUSR1 type to check and update the GUC variable,
it can work for most of the backends and background workers. Opinions?
SIGUSR1 looks reasonable. We can consider it as notifying that the server status has changed.
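As a rough illustration of the SIGUSR1 idea, the sketch below shows the usual pattern: the handler only sets a flag, and the process refreshes its reported value later at a safe point. This is not PostgreSQL code. All names (recovery_state_changed, in_recovery_reported, and the three functions) are hypothetical, and the real server would go through its own signal multiplexing rather than a bare signal() call.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Set by the handler, cleared when the change has been processed. */
static volatile sig_atomic_t recovery_state_changed = 0;

/* Hypothetical copy of the value that would be reported to the client. */
static bool in_recovery_reported = true;

/* Signal handler: do nothing except note that the status changed. */
static void
handle_sigusr1(int signo)
{
    (void) signo;
    recovery_state_changed = 1;
}

static void
install_recovery_signal_handler(void)
{
    void (*previous) (int) = signal(SIGUSR1, handle_sigusr1);

    assert(previous != SIG_ERR);
    (void) previous;
}

/*
 * Called from the main loop at a safe point. Returns true if the
 * reported value was refreshed because a notification was pending.
 */
static bool
process_recovery_state_change(bool server_in_recovery)
{
    if (!recovery_state_changed)
        return false;

    recovery_state_changed = 0;
    in_recovery_reported = server_in_recovery;
    return true;
}
```

Keeping the handler down to a single flag assignment is the design choice here: the actual variable update happens outside signal context, so only processes that care about the value need to check the flag.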
I've fully reviewed 0001-0003 and my comments follow. I'll review 0004-0007.
(1) 0001
before issuing the transaction_readonly to find out whether
the server is read-write or not is restuctured under a new
transaction_readonly -> "SHOW transaction_read_only"
restuctured -> restructured
(2) 0001
+ succesful connection or failure.
+ successful connection; if it returns <literal>on</literal>, means server
succesful -> successful
means -> it means
(3) 0003
+ But for servers version 12 or greater uses the <varname>transaction_read_only</varname>
+ GUC that is reported by the server upon successful connection.
GUC doesn't seem to be a term to be used in the user manual. Instead:
uses the value of <varname>transaction_read_only</varname> configuration parameter that is...
as in:
https://www.postgresql.org/docs/devel/libpq-connect.html
client_encoding
This sets the client_encoding configuration parameter for this connection.
application_name
Specifies a value for the application_name configuration parameter.
(4) 0003
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* session_read_only */
Looking at the comment for std_strings, it's better to change the comment to transaction_read_only to represent the backing configuration parameter name.
Regards
Takayuki Tsunakawa
I've looked through 0004-0007. I've only found the following:
(5) 0005
With this read-only option type, application can connect to
connecting to a read-only server in the list of hosts, in case
if there is any read-only servers available, the connection
attempt fails.
"connecting to" can be removed.
in case if there is any read-only servers
-> If There's no read only server
Regards
Takayuki Tsunakawa
On Wed, Mar 27, 2019 at 5:17 PM Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
I've looked through 0004-0007. I've only found the following:
(5) 0005
With this read-only option type, application can connect to
connecting to a read-only server in the list of hosts, in case
if there is any read-only servers available, the connection
attempt fails.
"connecting to" can be removed.
in case if there is any read-only servers
-> If There's no read only server
Thanks for the review.
I have addressed all of the comments you raised and attached an updated
version of the patches, along with a WIP implementation of the in_recovery
GUC_REPORT variable. This patch uses some of the implementation provided
earlier in thread [1].
While implementing the in_recovery configuration variable, I noticed that a
lot of code is added just for this variable. Is it worth it?
[1]: /messages/by-id/2239254.dtfY1H9x0t@hammer.magicstack.net
Regards,
Haribabu Kommi
Fujitsu Australia
Attachments:
0001-Restructure-the-code-to-remove-duplicate-code.patch
From aa812716104b3fbd787dae4483341894c9e5ed9a Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 21 Feb 2019 23:11:55 +1100
Subject: [PATCH 1/8] Restructure the code to remove duplicate code
The logic that checks the server version before issuing
SHOW transaction_read_only, to find out whether the server is
read-write or not, was duplicated. It is restructured under a new
connection status so that the duplicate code is removed. This is
required for the next set of patches.
---
doc/src/sgml/libpq.sgml | 26 +++++---
src/interfaces/libpq/fe-connect.c | 99 ++++++++++++-------------------
src/interfaces/libpq/libpq-fe.h | 3 +-
3 files changed, 57 insertions(+), 71 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c1d1b6b2db..6221e359f0 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1576,17 +1576,27 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<varlistentry id="libpq-connect-target-session-attrs" xreflabel="target_session_attrs">
<term><literal>target_session_attrs</literal></term>
<listitem>
+ <para>
+ The supported options for this parameter are, <literal>any</literal> and
+ <literal>read-write</literal>. The default value of this parameter,
+ <literal>any</literal>, regards all connections as acceptable.
+ If multiple hosts were specified in the connection string, based on the
+ specified value, any remaining servers will be tried before confirming
+ succesful connection or failure.
+ </para>
+
<para>
If this parameter is set to <literal>read-write</literal>, only a
connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server supports read-write transactions are not,
+ query <literal>SHOW transaction_read_only</literal> will be sent upon any
+ successful connection; if it returns <literal>on</literal>, it means server
+ doesn't support read-write transactions.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index e3bf6a7449..c68448786d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3188,6 +3188,43 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_WRITING;
}
+ conn->status = CONNECTION_CHECK_TARGET;
+ goto keep_going;
+ }
+
+ case CONNECTION_SETENV:
+ {
+
+ /*
+ * Do post-connection housekeeping (only needed in protocol 2.0).
+ *
+ * We pretend that the connection is OK for the duration of these
+ * queries.
+ */
+ conn->status = CONNECTION_OK;
+
+ switch (pqSetenvPoll(conn))
+ {
+ case PGRES_POLLING_OK: /* Success */
+ break;
+
+ case PGRES_POLLING_READING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_READING;
+
+ case PGRES_POLLING_WRITING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_WRITING;
+
+ default:
+ goto error_return;
+ }
+ }
+
+ /* Intentional fall through */
+
+ case CONNECTION_CHECK_TARGET:
+ {
/*
* If a read-write connection is required, see if we have one.
*
@@ -3229,68 +3266,6 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_OK;
}
- case CONNECTION_SETENV:
-
- /*
- * Do post-connection housekeeping (only needed in protocol 2.0).
- *
- * We pretend that the connection is OK for the duration of these
- * queries.
- */
- conn->status = CONNECTION_OK;
-
- switch (pqSetenvPoll(conn))
- {
- case PGRES_POLLING_OK: /* Success */
- break;
-
- case PGRES_POLLING_READING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_READING;
-
- case PGRES_POLLING_WRITING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_WRITING;
-
- default:
- goto error_return;
- }
-
- /*
- * If a read-write connection is required, see if we have one.
- * (This should match the stanza in the CONNECTION_AUTH_OK case
- * above.)
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but by
- * the same token they don't have any read-only mode, so we may
- * just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
- {
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
- {
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
- }
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
- }
-
- /* We can release the address list now. */
- release_conn_addrinfo(conn);
-
- /* We are open for business! */
- conn->status = CONNECTION_OK;
- return PGRES_POLLING_OK;
-
case CONNECTION_CONSUME:
{
conn->status = CONNECTION_OK;
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 97bc98b1f3..50cfe266c6 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -65,8 +65,9 @@ typedef enum
CONNECTION_NEEDED, /* Internal state: connect() needed */
CONNECTION_CHECK_WRITABLE, /* Check if we could make a writable
* connection. */
- CONNECTION_CONSUME /* Wait for any pending message and consume
+ CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
+ CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
typedef enum
--
2.20.1.windows.1
0002-New-TargetSessionAttrsType-enum.patch
From ebfa1b1e253aa661261760095d1a75293118161e Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 11:50:33 +1100
Subject: [PATCH 2/8] New TargetSessionAttrsType enum
This new enum makes it possible to compare the requested session type
directly instead of always comparing strings. This may not show
much improvement with the current code, but it will be useful with
further patches.
---
src/interfaces/libpq/fe-connect.c | 12 ++++++++----
src/interfaces/libpq/libpq-fe.h | 6 ++++++
src/interfaces/libpq/libpq-int.h | 1 +
3 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index c68448786d..1dc22e6bdf 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1243,8 +1243,11 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ else if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -3233,8 +3236,7 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type != SESSION_TYPE_ANY)
{
/*
* Save existing error messages across the PQsendQuery
@@ -3540,6 +3542,8 @@ makeEmptyPGconn(void)
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
+ conn->requested_session_type = SESSION_TYPE_ANY;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 50cfe266c6..6961d7ce8e 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -70,6 +70,12 @@ typedef enum
CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
+typedef enum
+{
+ SESSION_TYPE_ANY = 0, /* Any session (default) */
+ SESSION_TYPE_READ_WRITE /* Read-write session */
+} TargetSessionAttrsType;
+
typedef enum
{
PGRES_POLLING_FAILED = 0,
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index dbe0f7e5c0..551120660c 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,6 +365,7 @@ struct pg_conn
/* Type of connection to make. Possible values: any, read-write. */
char *target_session_attrs;
+ TargetSessionAttrsType requested_session_type;
/* Optional file to write trace info to */
FILE *Pfdebug;
--
2.20.1.windows.1
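The string-to-enum mapping that the patch above adds to connectOptions2 can be sketched in isolation as follows. The enum mirrors the two values introduced by the patch. The function parse_target_session_attrs is a hypothetical helper written for this illustration; the patch itself performs the comparison inline and sets conn->status on failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef enum
{
    SESSION_TYPE_ANY = 0,       /* Any session (default) */
    SESSION_TYPE_READ_WRITE     /* Read-write session */
} TargetSessionAttrsType;

/*
 * Hypothetical helper: translate the target_session_attrs string once,
 * so later code can compare enum values instead of strings.
 * An unset option (NULL) keeps the default, SESSION_TYPE_ANY.
 * Returns false for an unrecognized value.
 */
static bool
parse_target_session_attrs(const char *value, TargetSessionAttrsType *result)
{
    assert(result != NULL);

    if (value == NULL || strcmp(value, "any") == 0)
        *result = SESSION_TYPE_ANY;
    else if (strcmp(value, "read-write") == 0)
        *result = SESSION_TYPE_READ_WRITE;
    else
        return false;

    return true;
}
```

Later patches in the series extend the enum with further values, at which point each new option only needs one more branch here and no additional string comparisons elsewhere.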
0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch
From 45015c7271300e2a607d8a1211de361e8895f555 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 17:05:24 +1100
Subject: [PATCH 3/8] Make transaction_read_only as GUC_REPORT varaible
The value of the transaction_read_only GUC variable is used in multi-host
connections to identify the required read-write host, but currently
this is carried out by executing a command to find out whether the host
is read-write or not. Instead, reporting the GUC to the client
upon connection reduces the time needed to make the connection.
---
doc/src/sgml/libpq.sgml | 14 ++++---
doc/src/sgml/protocol.sgml | 8 ++--
src/backend/utils/misc/guc.c | 2 +-
src/interfaces/libpq/fe-connect.c | 70 +++++++++++++++++++++++--------
src/interfaces/libpq/fe-exec.c | 6 ++-
src/interfaces/libpq/libpq-int.h | 1 +
6 files changed, 74 insertions(+), 27 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 6221e359f0..fc5eec8450 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1594,8 +1594,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, it means server
- doesn't support read-write transactions.
+ successful connection if the server is prior to version 12; if it returns
+ <literal>on</literal>, it means server doesn't support read-write transactions.
+ But for server version 12 or greater uses the value of <varname>transaction_read_only</varname>
+ configuration parameter that is reported by the server upon successful connection.
</para>
</listitem>
</varlistentry>
@@ -1961,14 +1963,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by release before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index d66b860cbd..f3357b5c59 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELCT 1/0;
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index aa564d153a..d464be2c5b 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1503,7 +1503,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 1dc22e6bdf..0387d4fd09 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3238,26 +3238,61 @@ keep_going: /* We will come back to here until there is
if (conn->sversion >= 70400 &&
conn->requested_session_type != SESSION_TYPE_ANY)
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
+ }
+ else if (conn->transaction_read_only)
+ {
+ /* Not writable; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
}
/* We can release the address list now. */
@@ -3538,6 +3573,7 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 6202653826..d2658efba5 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1059,7 +1059,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1113,6 +1113,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 551120660c..1154926391 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -430,6 +430,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.20.1.windows.1
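For reference, the ParameterStatus handling added to pqSaveParameterStatus in the fe-exec.c hunk above can be sketched in isolation as follows. This is only an illustration under stated assumptions: SketchConn and sketch_save_parameter_status are hypothetical names that are not part of libpq, and the struct keeps just two of the flags that struct pg_conn tracks.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct pg_conn. */
typedef struct
{
	bool		std_strings;			/* standard_conforming_strings */
	bool		transaction_read_only;	/* transaction_read_only */
} SketchConn;

/*
 * Remember the parameters we care about when the server reports them.
 * Any value other than "on" is treated as false, as in the patch.
 */
void
sketch_save_parameter_status(SketchConn *conn, const char *name,
							 const char *value)
{
	assert(conn != NULL && name != NULL && value != NULL);

	if (strcmp(name, "standard_conforming_strings") == 0)
		conn->std_strings = (strcmp(value, "on") == 0);
	else if (strcmp(name, "transaction_read_only") == 0)
		conn->transaction_read_only = (strcmp(value, "on") == 0);
	/* all other parameters are ignored in this sketch */
}
```

Because the flag is refreshed on every report, a later check of the session type can read it from the connection without sending an extra query.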
0004-New-prefer-read-target_session_attrs-type.patch (application/x-patch)
From f7d15841528091b266da0ef43b47d1f4da21ed8d Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 17:56:48 +1100
Subject: [PATCH 4/8] New prefer-read target_session_attrs type
With the prefer-read option, an application prefers to connect to a
read-only server from the list of hosts if one is available;
otherwise it connects to a read-write server.
---
doc/src/sgml/libpq.sgml | 21 ++--
src/interfaces/libpq/fe-connect.c | 161 ++++++++++++++++++++++----
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 13 ++-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
5 files changed, 177 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index fc5eec8450..51b3e5a6f0 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1577,12 +1577,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- The supported options for this parameter are, <literal>any</literal> and
- <literal>read-write</literal>. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- If multiple hosts were specified in the connection string, based on the
- specified value, any remaining servers will be tried before confirming
- succesful connection or failure.
+ The supported options for this parameter are, <literal>any</literal>,
+ <literal>read-write</literal> and <literal>prefer-read</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts were specified in the
+ connection string, based on the specified value, any remaining servers
+ will be tried before confirming successful connection or failure.
</para>
<para>
@@ -1591,6 +1591,13 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
is considered acceptable.
</para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, a
+ connection in which read-only transactions are accepted by default
+ is preferred. If no such connection can be found, then a connection
+ in which read-write transactions are accepted will be considered.
+ </para>
+
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
@@ -1600,7 +1607,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
configuration parameter that is reported by the server upon successful connection.
</para>
</listitem>
- </varlistentry>
+ </varlistentry>
</variablelist>
</para>
</sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 0387d4fd09..4359a3e152 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -322,7 +322,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1247,6 +1247,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_ANY;
else if (strcmp(conn->target_session_attrs, "read-write") == 0)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else
{
conn->status = CONNECTION_BAD;
@@ -2141,13 +2143,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means, failed to connect to read-only servers
+ * and now try connect to read-write server again.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3229,7 +3249,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required, see
+ * if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3244,8 +3265,8 @@ keep_going: /* We will come back to here until there is
* Save existing error messages across the PQsendQuery
* attempt. This is necessary because PQsendQuery is
* going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
*/
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
@@ -3257,16 +3278,30 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
+
conn->status = CONNECTION_CHECK_WRITABLE;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
- else if (conn->transaction_read_only)
+ else if ((conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
const char *displayed_host;
const char *displayed_port;
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_target_connection;
+
/* Append error report to conn->errorMessage. */
if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
displayed_host = conn->connhost[conn->whichhost].hostaddr;
@@ -3276,16 +3311,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3293,8 +3340,36 @@ keep_going: /* We will come back to here until there is
conn->try_next_host = true;
goto keep_going;
}
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2)
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3360,11 +3435,33 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested mode is read-write,
+ * ignore it. Server is read-write and requested mode is
+ * prefer-read, record it for the first time and try to
+ * consume in the next scan (it means no read-only server
+ * is found in the first scan).
+ */
+ if ((readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_write_connection;
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3377,16 +3474,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3395,7 +3504,8 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3579,6 +3689,7 @@ makeEmptyPGconn(void)
conn->sock = PGINVALID_SOCKET;
conn->requested_session_type = SESSION_TYPE_ANY;
+ conn->read_write_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 6961d7ce8e..4b0fc80df2 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -73,7 +73,8 @@ typedef enum
typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
- SESSION_TYPE_READ_WRITE /* Read-write session */
+ SESSION_TYPE_READ_WRITE, /* Read-write session */
+ SESSION_TYPE_PREFER_READ /* Prefer read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 1154926391..9fbce9888b 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -363,7 +363,10 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: any, read-write,
+ * prefer-read.
+ */
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -400,6 +403,14 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * First read-write host index in the connection string.
+ *
+ * Initial value is -1, then the index of the first read-write host, -2
+ * during the second attempt of connection to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index beb45551a2..0e398136a5 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 26;
+use Test::More tests => 29;
# Initialize master node
my $node_master = get_new_node('master');
@@ -117,6 +117,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to master in "prefer-read" mode with a master-only list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
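The prefer-read host selection in the patch above can be approximated by the following standalone sketch. It assumes a simplified model: SketchHost and sketch_pick_prefer_read are hypothetical names, reachability and read-only status are given up front instead of being discovered over the wire, and the fallback simply returns the remembered index, whereas the patch reconnects to that host and sets read_write_host_index to -2 for the second attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the host list. */
typedef struct
{
	bool		reachable;	/* would a connection attempt succeed? */
	bool		read_only;	/* does transaction_read_only report "on"? */
} SketchHost;

/*
 * Return the index of the host a prefer-read connection would end up on,
 * or -1 if no host in the list is usable.
 */
int
sketch_pick_prefer_read(const SketchHost *hosts, int nhosts)
{
	int			read_write_host_index = -1; /* -1: none recorded yet */
	int			i;

	assert(hosts != NULL || nhosts == 0);

	/* First pass: take the first read-only host we can reach. */
	for (i = 0; i < nhosts; i++)
	{
		if (!hosts[i].reachable)
			continue;
		if (hosts[i].read_only)
			return i;
		/* Read-write host: remember only the first one seen. */
		if (read_write_host_index == -1)
			read_write_host_index = i;
	}

	/* No read-only host found: fall back to the recorded host, if any. */
	return read_write_host_index;
}
```

The three new test cases in 001_stream_rep.pl correspond to the lists (master, standby1), (standby1, master) and (master) in this model.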
0005-New-read-only-target_session_attrs-type.patch (application/x-patch)
From 536bf1d55e8c49ac02136503a8eab15c796f2226 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 18:02:59 +1100
Subject: [PATCH 5/8] New read-only target_session_attrs type
With the read-only option, an application can connect only to a
read-only server from the list of hosts. If no read-only server
is available, the connection attempt fails.
---
doc/src/sgml/libpq.sgml | 7 +++++-
src/interfaces/libpq/fe-connect.c | 34 ++++++++++++++++++++-------
src/interfaces/libpq/libpq-fe.h | 3 ++-
src/interfaces/libpq/libpq-int.h | 2 +-
src/test/recovery/t/001_stream_rep.pl | 10 +++++++-
5 files changed, 43 insertions(+), 13 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 51b3e5a6f0..a47081a265 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1578,7 +1578,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal> and <literal>prefer-read</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1606,6 +1606,11 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
But for server version 12 or greater uses the value of <varname>transaction_read_only</varname>
configuration parameter that is reported by the server upon successful connection.
</para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection
+ in which read-only transactions are accepted by default is considered acceptable.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 4359a3e152..668bb9712d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1249,6 +1249,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
+ else if (strcmp(conn->target_session_attrs, "read-only") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_ONLY;
else
{
conn->status = CONNECTION_BAD;
@@ -3249,8 +3251,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write or prefer-read connection is required, see
- * if we have one.
+ * If a read-write, prefer-read or read-only connection is
+ * required, see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3287,7 +3289,8 @@ keep_going: /* We will come back to here until there is
else if ((conn->transaction_read_only &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!conn->transaction_read_only &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/* Not a requested type; fail this connection. */
const char *displayed_host;
@@ -3347,17 +3350,28 @@ keep_going: /* We will come back to here until there is
/*
* Requested type is prefer-read, then record this host index
- * and try the other before considering it later
+ * and try the other before considering it later. If requested
+ * type of connection is read-only, ignore this connection.
*/
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)
{
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that are
+ * default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto target_accept_connection;
+
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->read_write_host_index == -1)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
/*
@@ -3445,12 +3459,14 @@ keep_going: /* We will come back to here until there is
* ignore it. Server is read-write and requested mode is
* prefer-read, record it for the first time and try to
* consume in the next scan (it means no read-only server
- * is found in the first scan).
+ * is found in the first scan). Server is read-write and
+ * requested mode is read-only, ignore this connection.
*/
if ((readonly_server &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!readonly_server &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/*
* The following scenario is possible only for the
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 4b0fc80df2..5d0b885dae 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -74,7 +74,8 @@ typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
- SESSION_TYPE_PREFER_READ /* Prefer read only session */
+ SESSION_TYPE_PREFER_READ, /* Prefer read only session */
+ SESSION_TYPE_READ_ONLY /* Read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 9fbce9888b..79ddd4d886 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,7 +365,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read.
+ * prefer-read and read-only.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 0e398136a5..651107f49b 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 29;
+use Test::More tests => 31;
# Initialize master node
my $node_master = get_new_node('master');
@@ -129,6 +129,14 @@ test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
test_target_session_attrs($node_master, $node_master, $node_master,
"prefer-read", 0);
+# Connect to standby1 in "read-only" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "read-only", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
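As a rough summary of patches 4 and 5, the following sketch maps a target_session_attrs value to a session type and decides whether a server is skipped on the first pass over the host list. The names (SketchSessionType, sketch_parse_session_attrs, sketch_is_mismatch) are hypothetical and only mirror the string comparisons in connectOptions2 and the condition used in the CONNECTION_CHECK_TARGET and CONNECTION_CHECK_WRITABLE states; the prefer-read fallback on the second pass is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical mirror of TargetSessionAttrsType after patch 5. */
typedef enum
{
	SKETCH_ANY = 0,
	SKETCH_READ_WRITE,
	SKETCH_PREFER_READ,
	SKETCH_READ_ONLY,
	SKETCH_INVALID				/* unrecognized option value */
} SketchSessionType;

/* Translate the option string, as connectOptions2 does with strcmp(). */
SketchSessionType
sketch_parse_session_attrs(const char *value)
{
	assert(value != NULL);

	if (strcmp(value, "any") == 0)
		return SKETCH_ANY;
	if (strcmp(value, "read-write") == 0)
		return SKETCH_READ_WRITE;
	if (strcmp(value, "prefer-read") == 0)
		return SKETCH_PREFER_READ;
	if (strcmp(value, "read-only") == 0)
		return SKETCH_READ_ONLY;
	return SKETCH_INVALID;
}

/* True if the server is not of the requested type on the first pass. */
bool
sketch_is_mismatch(SketchSessionType requested, bool server_read_only)
{
	switch (requested)
	{
		case SKETCH_READ_WRITE:
			return server_read_only;
		case SKETCH_PREFER_READ:
		case SKETCH_READ_ONLY:
			return !server_read_only;
		default:
			return false;		/* "any" accepts every server */
	}
}
```

In this model the only difference between prefer-read and read-only is what happens after every host has mismatched: prefer-read retries the first read-write host, while read-only reports failure.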
0006-Primary-prefer-standby-and-standby-options.patch (application/x-patch)
From 6802a581641d3082e6a15dbe06c838e64a6a8c2b Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 18:06:46 +1100
Subject: [PATCH 6/8] Primary, prefer-standby and standby options
New options that check whether a server is in recovery mode before
it is considered for the connection. To determine this, libpq sends
the query 'SELECT pg_is_in_recovery()' to the server.
---
doc/src/sgml/libpq.sgml | 26 ++-
src/interfaces/libpq/fe-connect.c | 236 +++++++++++++++++++++++---
src/interfaces/libpq/libpq-fe.h | 8 +-
src/interfaces/libpq/libpq-int.h | 4 +-
src/test/recovery/t/001_stream_rep.pl | 18 +-
5 files changed, 262 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index a47081a265..328b632e16 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1578,7 +1578,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>prefer-standby</literal> and <literal>standby</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1611,6 +1612,29 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
If this parameter is set to <literal>read-only</literal>, only a connection
in which read-only transactions are accepted by default is considered acceptable.
</para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, only a connection to a
+ server that is not in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection to a
+ server that is in recovery mode is preferred. If no such connection can be found,
+ then a connection to a server that is not in recovery mode will be considered.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection to a
+ server that is in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server is in recovery mode, the query <literal>SELECT pg_is_in_recovery()</literal>
+ will be sent upon any successful connection; if it returns <literal>t</literal>, the server
+ is in recovery mode.
+ </para>
+
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 668bb9712d..7a75b692c8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -124,6 +124,7 @@ static int ldapServiceLookup(const char *purl, PQconninfoOption *options,
#define DefaultOption ""
#define DefaultAuthtype ""
#define DefaultTargetSessionAttrs "any"
+#define DefaultTargetServerType "any"
#ifdef USE_SSL
#define DefaultSSLMode "prefer"
#else
@@ -322,7 +323,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1251,6 +1252,12 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else if (strcmp(conn->target_session_attrs, "read-only") == 0)
conn->requested_session_type = SESSION_TYPE_READ_ONLY;
+ else if (strcmp(conn->target_session_attrs, "primary") == 0)
+ conn->requested_session_type = SESSION_TYPE_PRIMARY;
+ else if (strcmp(conn->target_session_attrs, "prefer-standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_STANDBY;
+ else if (strcmp(conn->target_session_attrs, "standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_STANDBY;
else
{
conn->status = CONNECTION_BAD;
@@ -2106,6 +2113,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_NEEDED:
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -2145,19 +2153,19 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- if (conn->read_write_host_index >= 0)
+ if (conn->read_write_or_primary_host_index >= 0)
{
/*
* Getting here means, failed to connect to read-only servers
* and now try connect to read-write server again.
*/
- conn->whichhost = conn->read_write_host_index;
+ conn->whichhost = conn->read_write_or_primary_host_index;
/*
* Reset the host index value to avoid recursion during the
* second connection attempt.
*/
- conn->read_write_host_index = -2;
+ conn->read_write_or_primary_host_index = -2;
}
else
{
@@ -3259,7 +3267,9 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->requested_session_type != SESSION_TYPE_ANY)
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY))
{
if (conn->sversion < 120000)
{
@@ -3302,7 +3312,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
/* Append error report to conn->errorMessage. */
@@ -3333,8 +3343,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3349,30 +3359,70 @@ keep_going: /* We will come back to here until there is
}
/*
- * Requested type is prefer-read, then record this host index
- * and try the other before considering it later. If requested
- * type of connection is read-only, ignore this connection.
+ * Servers before 9.0 don't support recovery, so skip the check
+ * when the requested type of connection is primary,
+ * prefer-standby or standby.
*/
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_session_type == SESSION_TYPE_PRIMARY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ /*
+ * If the requested type is prefer-read or prefer-standby,
+ * record this host index and try the other hosts before considering
+ * it later. If the requested type is read-only or standby,
+ * ignore this connection.
+ */
+
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
- conn->requested_session_type == SESSION_TYPE_READ_ONLY)
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)
{
/*
* The following scenario is possible only for the
- * prefer-read mode for the next pass of the list of
- * connections as it couldn't find any servers that are
- * default read-only.
+ * prefer-read or prefer-standby mode for the next pass of
+ * the list of connections as it couldn't find any servers
+ * that are default read-only or in recovery mode.
*/
- if (conn->read_write_host_index == -2)
- goto target_accept_connection;
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY)
+ {
+ if (conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+ }
/*
* Try next host if any, but we don't want to consider
@@ -3474,7 +3524,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_write_connection;
/* Not a requested type; fail this connection. */
@@ -3509,8 +3559,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3563,6 +3613,144 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Reject a server in recovery when the requested mode is
+ * primary, and a server not in recovery when it is standby
+ * or prefer-standby. In prefer-standby mode, remember the
+ * first such primary so the next scan can use it if no
+ * standby server is found.
+ */
+ if ((standby_server &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_recovery_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
+ consume_checked_recovery_connection:
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3705,7 +3893,7 @@ makeEmptyPGconn(void)
conn->sock = PGINVALID_SOCKET;
conn->requested_session_type = SESSION_TYPE_ANY;
- conn->read_write_host_index = -1;
+ conn->read_write_or_primary_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 5d0b885dae..4c1b849019 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -67,7 +67,8 @@ typedef enum
* connection. */
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
@@ -75,7 +76,10 @@ typedef enum
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
SESSION_TYPE_PREFER_READ, /* Prefer read only session */
- SESSION_TYPE_READ_ONLY /* Read only session */
+ SESSION_TYPE_READ_ONLY, /* Read only session */
+ SESSION_TYPE_PRIMARY, /* Primary server */
+ SESSION_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SESSION_TYPE_STANDBY /* Standby server */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 79ddd4d886..d60ee385a6 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,7 +365,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read and read-only.
+ * prefer-read, read-only, primary, prefer-standby and standby.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -409,7 +409,7 @@ struct pg_conn
* Initial value is -1, then the index of the first read-write host, -2
* during the second attempt of connection to avoid recursion.
*/
- int read_write_host_index;
+ int read_write_or_primary_host_index;
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 651107f49b..f18fef4445 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 31;
+use Test::More tests => 35;
# Initialize master node
my $node_master = get_new_node('master');
@@ -137,6 +137,22 @@ test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"read-only", 0);
+# Connect to master in "primary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
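To make the two-pass behaviour of prefer-standby in the patch above easier to follow, here is a minimal self-contained model in C. It is only a sketch, not libpq code: choose_host, session_mode and the in_recovery array are hypothetical names invented for illustration, and the real logic is spread across the PQconnectPoll state machine. The fallback variable plays the role that read_write_or_primary_host_index plays in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enum; covers only the recovery-related modes of the patch. */
typedef enum
{
    MODE_PRIMARY,
    MODE_PREFER_STANDBY,
    MODE_STANDBY
} session_mode;

/*
 * Return the index of the host that would be chosen, or -1 if no host
 * is acceptable.  in_recovery[i] is true when host i is a standby.
 */
int
choose_host(const bool *in_recovery, size_t nhosts, session_mode mode)
{
    int fallback = -1;      /* first primary seen in prefer-standby mode */

    assert(in_recovery != NULL || nhosts == 0);

    /* First pass: look for a host of exactly the requested type. */
    for (size_t i = 0; i < nhosts; i++)
    {
        bool standby = in_recovery[i];

        if (mode == MODE_PRIMARY)
        {
            if (!standby)
                return (int) i;
        }
        else
        {
            if (standby)
                return (int) i;
            /* Only prefer-standby remembers a primary for later. */
            if (mode == MODE_PREFER_STANDBY && fallback == -1)
                fallback = (int) i;
        }
    }

    /* Second pass: prefer-standby falls back to the recorded primary. */
    if (mode == MODE_PREFER_STANDBY)
        return fallback;

    return -1;
}
```

With a master,standby1 host list this model picks index 1 for both prefer-standby and standby, and with a master,master list it picks index 0 for prefer-standby, which agrees with the expectations in the test cases added to 001_stream_rep.pl.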
Attachment: 0007-New-function-to-rejecting-the-checked-write-connecti.patch (application/x-patch)
From cbe3b9461a6fb4fcde4168f5a45ae0e6af0d900a Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Mon, 25 Mar 2019 18:11:18 +1100
Subject: [PATCH 7/8] New function to rejecting the checked write connection
After the connection has been checked for writability, if the
result means the connection must be rejected, call the newly
added function to reject it.
---
src/interfaces/libpq/fe-connect.c | 123 ++++++++++++------------------
1 file changed, 47 insertions(+), 76 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 7a75b692c8..82b80385d1 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2032,6 +2032,51 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+static void
+reject_checked_write_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3302,10 +3347,6 @@ keep_going: /* We will come back to here until there is
(conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
- /* Not a requested type; fail this connection. */
- const char *displayed_host;
- const char *displayed_port;
-
/*
* The following scenario is possible only for the
* prefer-read mode for the next pass of the list of
@@ -3315,42 +3356,7 @@ keep_going: /* We will come back to here until there is
if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
@@ -3531,42 +3537,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
--
2.20.1.windows.1
Attachment: 0008-Server-recovery-mode-handling.patch (application/x-patch)
From b4c362af111487617f560df54fd0f2971628642d Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 28 Mar 2019 15:30:01 +1100
Subject: [PATCH 8/8] Server recovery mode handling
An in_recovery GUC_REPORT parameter is added to inform clients when
the server is in recovery mode. This lets a client connect to a
standby server with a faster check instead of executing a command.
A new SIGUSR1 interrupt is added to support reporting
of recovery mode exit to all backends and their respective
clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
---
doc/src/sgml/libpq.sgml | 14 ++-
doc/src/sgml/protocol.sgml | 8 +-
src/backend/access/transam/xlog.c | 9 ++
src/backend/commands/async.c | 53 +++++++++++
src/backend/storage/ipc/procarray.c | 30 ++++++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 ++
src/backend/tcop/postgres.c | 4 +
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 17 +++-
src/include/commands/async.h | 5 +
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/interfaces/libpq/fe-connect.c | 133 +++++++++++++++++----------
src/interfaces/libpq/fe-exec.c | 4 +
src/interfaces/libpq/libpq-int.h | 1 +
17 files changed, 237 insertions(+), 59 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 328b632e16..f3bbff9193 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1631,8 +1631,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server is in recovery mode or not, query <literal>SELECT pg_is_in_recovery()</literal>
- will be sent upon any successful connection; if it returns <literal>t</literal>, means server
- is in recovery mode.
+ will be sent upon any successful connection if the server is older than version 12; if it returns
+ <literal>t</literal>, the server is in recovery mode. For servers of version 12 or later, libpq
+ instead uses the value of the <varname>in_recovery</varname> configuration parameter that the
+ server reports upon successful connection.
</para>
</listitem>
@@ -2000,15 +2002,17 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by release before 12.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index f3357b5c59..0da02263ea 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1284,15 +1284,17 @@ SELCT 1/0;
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 19d7911ec5..fa51c736fe 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7670,7 +7670,10 @@ StartupXLOG(void)
* RecoverPreparedTransactions(), see notes for lock_twophase_recover()
*/
if (standbyState != STANDBY_DISABLED)
+ {
ShutdownRecoveryTransactionEnvironment();
+ SendRecoveryExitSignal();
+ }
/* Shut down xlogreader */
if (readFile >= 0)
@@ -7879,6 +7882,12 @@ RecoveryInProgress(void)
InitXLOGAccess();
}
+ /* Update in_recovery status. */
+ if (LocalRecoveryInProgress)
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
/*
* Note: We don't need a memory barrier when we're still in recovery.
* We might exit recovery immediately after return, so the caller
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 5a7ee0de4c..934844523c 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -356,6 +356,15 @@ static List *upperPendingNotifies = NIL; /* list of upper-xact lists */
*/
volatile sig_atomic_t notifyInterruptPending = false;
+/*
+ * Inbound recovery-exit interrupts are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
+
/* True if we've registered an on_shmem_exit cleanup */
static bool unlistenExitRegistered = false;
@@ -1736,6 +1745,50 @@ ProcessNotifyInterrupt(void)
ProcessIncomingNotify();
}
+/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited the recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+*/
+void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
/*
* Read all pending notifications from the queue, and deliver appropriate
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 010cc061c8..a5f3568d09 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2972,6 +2972,36 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
return true; /* timed out, still conflicts */
}
+/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ {
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
/*
* ProcArraySetReplicationSlotXmin
*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7605b2c367..e4548dc323 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -292,6 +292,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
SetLatch(MyLatch);
latch_sigusr1_handler();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index cd56dca3ae..65f6ec7ca2 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -138,6 +138,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f9ce3d8f22..d8d13f5a75 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -543,6 +543,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index d228bbed68..1c0f51ac8b 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_int \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index d464be2c5b..98795126d0 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -578,7 +578,7 @@ static char *recovery_target_xid_string;
static char *recovery_target_time_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
-
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
char *role_string;
@@ -1728,6 +1728,21 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ /*
+ * Not for general use --- used to indicate whether the instance is in
+ * recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
{
{"allow_system_table_mods", PGC_POSTMASTER, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
diff --git a/src/include/commands/async.h b/src/include/commands/async.h
index cfea78e039..7c0bbd3897 100644
--- a/src/include/commands/async.h
+++ b/src/include/commands/async.h
@@ -24,6 +24,7 @@
extern bool Trace_notify;
extern volatile sig_atomic_t notifyInterruptPending;
+extern volatile sig_atomic_t recoveryExitInterruptPending;
extern Size AsyncShmemSize(void);
extern void AsyncShmemInit(void);
@@ -54,4 +55,8 @@ extern void HandleNotifyInterrupt(void);
/* process interrupts */
extern void ProcessNotifyInterrupt(void);
+/* recovery exit interrupt handling functions */
+extern void HandleRecoveryExitInterrupt(void);
+extern void ProcessRecoveryExitInterrupt(void);
+
#endif /* ASYNC_H */
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index bd24850989..296d606e51 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -113,6 +113,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void XidCacheRemoveRunningXids(TransactionId xid,
int nxids, const TransactionId *xids,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 9f2f965d5c..722357c829 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -42,6 +42,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 2361243514..3bb9023a6b 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 82b80385d1..15ac9866bd 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2077,6 +2077,49 @@ reject_checked_write_connection(PGconn *conn)
conn->try_next_host = true;
}
+static void
+reject_checked_recovery_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3374,27 +3417,52 @@ keep_going: /* We will come back to here until there is
conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
conn->requested_session_type == SESSION_TYPE_STANDBY)))
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
}
+ else if ((conn->in_recovery &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
- conn->status = CONNECTION_CHECK_RECOVERY;
+ reject_checked_recovery_connection(conn);
+ goto keep_going;
+ }
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
}
/*
@@ -3643,40 +3711,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is not in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record primary host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_recovery_connection(conn);
goto keep_going;
}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index d2658efba5..38b8abcc44 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1117,6 +1117,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
{
conn->transaction_read_only = (strcmp(value, "on") == 0);
}
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index d60ee385a6..d135f3d9a3 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -442,6 +442,7 @@ struct pg_conn
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
bool transaction_read_only; /* transaction_read_only */
+ bool in_recovery; /* in_recovery */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.20.1.windows.1
Hi Hari-san,
I've reviewed all the files. The patch would be OK when the following have been fixed, except for the complexity of fe-connect.c (which probably cannot be improved.)
Unfortunately, I'll be absent next week. The earliest date I can do the test will be April 8 or 9. I hope someone could take care of this patch...
(1) 0001
With this read-only option type, application can connect to
to a read-only server in the list of hosts, in case
...
before issuing the SHOW transaction_readonly to find out whether
"to" appears twice in a row.
transaction_readonly -> transaction_read_only
(2) 0001
+ succesful connection or failure.
succesful -> successful
(3) 0008
to conenct to a standby server with a faster check instead of
conenct -> connect
(4) 0008
Logically, recovery exit should be notified after the following statement:
XLogCtl->SharedRecoveryInProgress = false;
(5) 0008
+ /* Update in_recovery status. */
+ if (LocalRecoveryInProgress)
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
This SetConfigOption() is called for every RecoveryInProgress() call on the standby. It may cause undesirable overhead. How about just calling SetConfigOption() once in InitPostgres() to set the initial value for in_recovery? InitPostgres() and its subsidiary functions call SetConfigOption() likewise.
(6) 0008
async.c is for LISTEN/UNLISTEN/NOTIFY. How about adding the new functions in postgres.c like RecoveryConflictInterrupt()?
(7) 0008
+ if (pid != 0)
+ {
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
The braces are not necessary because the block only contains a single statement.
Regards
Takayuki Tsunakawa
On Fri, Mar 29, 2019 at 7:06 PM Tsunakawa, Takayuki <
tsunakawa.takay@jp.fujitsu.com> wrote:
Hi Hari-san,
I've reviewed all the files. The patch would be OK when the following
have been fixed, except for the complexity of fe-connect.c (which probably
cannot be improved.)
Unfortunately, I'll be absent next week. The earliest date I can do the
test will be April 8 or 9. I hope someone could take care of this patch...
Thanks for the review. Apologies that I could not finish it on time
because of other work.
(1) 0001
With this read-only option type, application can connect to
to a read-only server in the list of hosts, in case
...
before issuing the SHOW transaction_readonly to find out whether
"to" appears twice in a row.
transaction_readonly -> transaction_read_only
(2) 0001
+ succesful connection or failure.
succesful -> successful
(3) 0008
to conenct to a standby server with a faster check instead of
conenct -> connect
(4) 0008
Logically, recovery exit should be notified after the following statement:
XLogCtl->SharedRecoveryInProgress = false;
(5) 0008
+ /* Update in_recovery status. */
+ if (LocalRecoveryInProgress)
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
This SetConfigOption() is called for every RecoveryInProgress() call on
the standby. It may cause undesirable overhead. How about just calling
SetConfigOption() once in InitPostgres() to set the initial value for
in_recovery? InitPostgres() and its subsidiary functions call
SetConfigOption() likewise.
(6) 0008
async.c is for LISTEN/UNLISTEN/NOTIFY. How about adding the new functions
in postgres.c like RecoveryConflictInterrupt()?
(7) 0008
+ if (pid != 0)
+ {
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
The braces are not necessary because the block only contains a single
statement.
I have addressed all the comments that you raised above and attached the
updated patches.
Regards,
Haribabu Kommi
Fujitsu Australia
Attachments:
0001-Restructure-the-code-to-remove-duplicate-code.patch (application/octet-stream)
From 940a0da1da7378e7f9bf603a11d55eb984f5c427 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 21 Feb 2019 23:11:55 +1100
Subject: [PATCH 1/8] Restructure the code to remove duplicate code
The logic that checks the server version before issuing
SHOW transaction_read_only (to find out whether the server is
read-write or not) was duplicated. It is now restructured under a
new connection status, CONNECTION_CHECK_TARGET, so that the duplicate
code is removed. This is required for the next set of patches.
---
doc/src/sgml/libpq.sgml | 26 +++++---
src/interfaces/libpq/fe-connect.c | 99 ++++++++++++-------------------
src/interfaces/libpq/libpq-fe.h | 3 +-
3 files changed, 57 insertions(+), 71 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index fe833aa626..67e81e98e1 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1647,17 +1647,27 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<varlistentry id="libpq-connect-target-session-attrs" xreflabel="target_session_attrs">
<term><literal>target_session_attrs</literal></term>
<listitem>
+ <para>
+ The supported options for this parameter are, <literal>any</literal> and
+ <literal>read-write</literal>. The default value of this parameter,
+ <literal>any</literal>, regards all connections as acceptable.
+ If multiple hosts were specified in the connection string, based on the
+ specified value, any remaining servers will be tried before confirming
+ succesful connection or failure.
+ </para>
+
<para>
If this parameter is set to <literal>read-write</literal>, only a
connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server supports read-write transactions are not,
+ query <literal>SHOW transaction_read_only</literal> will be sent upon any
+ successful connection; if it returns <literal>on</literal>, it means server
+ doesn't support read-write transactions.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 87df79880a..9bb2e9631f 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3434,6 +3434,43 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_WRITING;
}
+ conn->status = CONNECTION_CHECK_TARGET;
+ goto keep_going;
+ }
+
+ case CONNECTION_SETENV:
+ {
+
+ /*
+ * Do post-connection housekeeping (only needed in protocol 2.0).
+ *
+ * We pretend that the connection is OK for the duration of these
+ * queries.
+ */
+ conn->status = CONNECTION_OK;
+
+ switch (pqSetenvPoll(conn))
+ {
+ case PGRES_POLLING_OK: /* Success */
+ break;
+
+ case PGRES_POLLING_READING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_READING;
+
+ case PGRES_POLLING_WRITING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_WRITING;
+
+ default:
+ goto error_return;
+ }
+ }
+
+ /* Intentional fall through */
+
+ case CONNECTION_CHECK_TARGET:
+ {
/*
* If a read-write connection is required, see if we have one.
*
@@ -3475,68 +3512,6 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_OK;
}
- case CONNECTION_SETENV:
-
- /*
- * Do post-connection housekeeping (only needed in protocol 2.0).
- *
- * We pretend that the connection is OK for the duration of these
- * queries.
- */
- conn->status = CONNECTION_OK;
-
- switch (pqSetenvPoll(conn))
- {
- case PGRES_POLLING_OK: /* Success */
- break;
-
- case PGRES_POLLING_READING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_READING;
-
- case PGRES_POLLING_WRITING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_WRITING;
-
- default:
- goto error_return;
- }
-
- /*
- * If a read-write connection is required, see if we have one.
- * (This should match the stanza in the CONNECTION_AUTH_OK case
- * above.)
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but by
- * the same token they don't have any read-only mode, so we may
- * just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
- {
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
- {
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
- }
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
- }
-
- /* We can release the address list now. */
- release_conn_addrinfo(conn);
-
- /* We are open for business! */
- conn->status = CONNECTION_OK;
- return PGRES_POLLING_OK;
-
case CONNECTION_CONSUME:
{
conn->status = CONNECTION_OK;
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index f44030b4b0..39579cd040 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -67,7 +67,8 @@ typedef enum
* connection. */
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
- CONNECTION_GSS_STARTUP /* Negotiating GSSAPI. */
+ CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
+ CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
typedef enum
--
2.20.1.windows.1
0002-New-TargetSessionAttrsType-enum.patch (application/octet-stream)
From 512be7c3a908437214926b43f8fc1a189a48cb54 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 11:50:33 +1100
Subject: [PATCH 2/8] New TargetSessionAttrsType enum
This new enum lets the code compare the requested session type
directly instead of doing a string comparison every time. It may not
show much improvement with the current code, but it will be useful
with further patches.
---
src/interfaces/libpq/fe-connect.c | 12 ++++++++----
src/interfaces/libpq/libpq-fe.h | 6 ++++++
src/interfaces/libpq/libpq-int.h | 1 +
3 files changed, 15 insertions(+), 4 deletions(-)
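As a rough illustration of the idea in this patch (parse the option string once, then compare enum values afterwards), here is a minimal standalone sketch. The enum mirrors the one added to libpq-fe.h; the helper parse_target_session_attrs() is a hypothetical name used only for this example and is not part of the patch, where the equivalent logic lives inline in connectOptions2().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the enum this patch adds to libpq-fe.h. */
typedef enum
{
	SESSION_TYPE_ANY = 0,		/* Any session (default) */
	SESSION_TYPE_READ_WRITE		/* Read-write session */
} TargetSessionAttrsType;

/*
 * Hypothetical helper (not in the patch): translate the option string once.
 * A NULL string keeps the default, as makeEmptyPGconn() presets
 * SESSION_TYPE_ANY. Returns false for an unrecognized value.
 */
static bool
parse_target_session_attrs(const char *value, TargetSessionAttrsType *out)
{
	assert(out != NULL);

	if (value == NULL || strcmp(value, "any") == 0)
		*out = SESSION_TYPE_ANY;
	else if (strcmp(value, "read-write") == 0)
		*out = SESSION_TYPE_READ_WRITE;
	else
		return false;			/* caller reports the invalid value */

	return true;
}
```

Once the value is stored, later checks reduce to something like `requested_session_type != SESSION_TYPE_ANY`, which is what the fe-connect.c hunk below does.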
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 9bb2e9631f..24f203230b 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1294,8 +1294,11 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ else if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -3479,8 +3482,7 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type != SESSION_TYPE_ANY)
{
/*
* Save existing error messages across the PQsendQuery
@@ -3789,6 +3791,8 @@ makeEmptyPGconn(void)
conn->try_gss = true;
#endif
+ conn->requested_session_type = SESSION_TYPE_ANY;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 39579cd040..ab415d684a 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -71,6 +71,12 @@ typedef enum
CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
+typedef enum
+{
+ SESSION_TYPE_ANY = 0, /* Any session (default) */
+ SESSION_TYPE_READ_WRITE /* Read-write session */
+} TargetSessionAttrsType;
+
typedef enum
{
PGRES_POLLING_FAILED = 0,
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 1221ea9eef..6ba936dd10 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -366,6 +366,7 @@ struct pg_conn
/* Type of connection to make. Possible values: any, read-write. */
char *target_session_attrs;
+ TargetSessionAttrsType requested_session_type;
/* Optional file to write trace info to */
FILE *Pfdebug;
--
2.20.1.windows.1
0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch (application/octet-stream)
From 5b05ff51abfd9368a434db4ba3ab5b6232b480ce Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 17:05:24 +1100
Subject: [PATCH 3/8] Make transaction_read_only as GUC_REPORT varaible
The value of the transaction_read_only GUC variable is used in
multi-host connections to identify a read-write host. Currently this
is carried out by executing a command on each host to find out whether
it is read-write or not. Reporting the GUC to the client upon
connection instead reduces the time needed to make the connection.
---
doc/src/sgml/libpq.sgml | 14 ++++---
doc/src/sgml/protocol.sgml | 8 ++--
src/backend/utils/misc/guc.c | 2 +-
src/interfaces/libpq/fe-connect.c | 70 +++++++++++++++++++++++--------
src/interfaces/libpq/fe-exec.c | 6 ++-
src/interfaces/libpq/libpq-int.h | 1 +
6 files changed, 74 insertions(+), 27 deletions(-)
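The decision this patch introduces can be sketched in isolation as follows. This is only a model under stated assumptions: ConnModel, CheckAction, save_parameter_status() and check_target() are hypothetical names invented for the example, standing in for the relevant PGconn fields, the CONNECTION_CHECK_TARGET branch, and the new branch in pqSaveParameterStatus(). Only the version thresholds (70400, 120000) and the "on" comparison come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the few PGconn fields the check reads. */
typedef struct
{
	int			sversion;				/* e.g. 110005 or 120000 */
	bool		want_read_write;		/* target_session_attrs=read-write */
	bool		transaction_read_only;	/* last value reported by server */
} ConnModel;

typedef enum
{
	CHECK_ACCEPT,				/* use this connection */
	CHECK_SEND_SHOW_QUERY,		/* older server: must ask explicitly */
	CHECK_REJECT				/* close it and try the next host */
} CheckAction;

/* Models the new branch in pqSaveParameterStatus(). */
static void
save_parameter_status(ConnModel *conn, const char *name, const char *value)
{
	if (strcmp(name, "transaction_read_only") == 0)
		conn->transaction_read_only = (strcmp(value, "on") == 0);
}

/* Models the CONNECTION_CHECK_TARGET decision after this patch. */
static CheckAction
check_target(const ConnModel *conn)
{
	assert(conn != NULL);

	/* Pre-7.4 servers have no read-only mode; "any" needs no check. */
	if (conn->sversion < 70400 || !conn->want_read_write)
		return CHECK_ACCEPT;

	/* Servers before 12 do not report the GUC, so a round trip is needed. */
	if (conn->sversion < 120000)
		return CHECK_SEND_SHOW_QUERY;

	/* Version 12 or later: rely on the value reported at connection time. */
	return conn->transaction_read_only ? CHECK_REJECT : CHECK_ACCEPT;
}
```

The saving is the avoided SHOW round trip on version 12 or later; older servers take the same path as before.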
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 67e81e98e1..12bc0b3fce 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1665,8 +1665,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, it means server
- doesn't support read-write transactions.
+ successful connection if the server is prior to version 12; if it returns
+ <literal>on</literal>, it means server doesn't support read-write transactions.
+ But for server version 12 or greater uses the value of <varname>transaction_read_only</varname>
+ configuration parameter that is reported by the server upon successful connection.
</para>
</listitem>
</varlistentry>
@@ -2032,14 +2034,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by release before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 09893d96b5..d62ccceaa0 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELCT 1/0;
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index f7f726b5ae..045b659155 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1545,7 +1545,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 24f203230b..ca5e6d67d3 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3484,26 +3484,61 @@ keep_going: /* We will come back to here until there is
if (conn->sversion >= 70400 &&
conn->requested_session_type != SESSION_TYPE_ANY)
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
+ }
+ else if (conn->transaction_read_only)
+ {
+ /* Not writable; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
}
/* We can release the address list now. */
@@ -3784,6 +3819,7 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 6202653826..d2658efba5 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1059,7 +1059,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1113,6 +1113,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 6ba936dd10..c50877e17a 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -431,6 +431,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.20.1.windows.1
0004-New-prefer-read-target_session_attrs-type.patch (application/octet-stream)
From 7c8f2862fc28226c3b91aa2c676b6f7297027701 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 17:56:48 +1100
Subject: [PATCH 4/8] New prefer-read target_session_attrs type
With this prefer-read option type, an application can prefer
connecting to a read-only server if one is available in the list
of hosts; otherwise it connects to a read-write server.
---
doc/src/sgml/libpq.sgml | 21 ++--
src/interfaces/libpq/fe-connect.c | 161 ++++++++++++++++++++++----
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 13 ++-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
5 files changed, 177 insertions(+), 35 deletions(-)
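The two-pass host selection that prefer-read relies on can be modelled roughly as below. This is a simplified sketch, not the patch's code: HostModel and choose_host_prefer_read() are hypothetical names, and a host's reachability and read-only state are assumed not to change between passes. What it does take from the patch is the meaning of read_write_host_index: -1 means no read-write host has been seen yet, a value >= 0 records the first read-write host, and -2 marks the second pass, in which the recorded host is accepted as is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one entry in the host list. */
typedef struct
{
	bool		reachable;		/* connection attempt succeeds */
	bool		read_only;		/* transaction_read_only is "on" */
} HostModel;

/*
 * Returns the index of the host a prefer-read connection would end up on,
 * or -1 if no host is usable.
 */
static int
choose_host_prefer_read(const HostModel *hosts, int nhosts)
{
	int			read_write_host_index = -1; /* -1: none seen, -2: 2nd pass */
	int			whichhost = 0;

	assert(hosts != NULL || nhosts == 0);

	while (whichhost < nhosts)
	{
		const HostModel *h = &hosts[whichhost];

		if (h->reachable)
		{
			/* Read-only host, or second pass: consume this connection. */
			if (h->read_only || read_write_host_index == -2)
				return whichhost;

			/* Remember the first read-write host for a later fallback. */
			if (read_write_host_index == -1)
				read_write_host_index = whichhost;
		}

		if (whichhost + 1 >= nhosts)
		{
			/* List exhausted: fall back to the recorded host, if any. */
			if (read_write_host_index < 0)
				return -1;
			whichhost = read_write_host_index;
			read_write_host_index = -2; /* avoid looping a third time */
			continue;
		}
		whichhost++;
	}
	return -1;
}
```

For example, with the list {read-write, read-only} the model picks index 1, while with {read-write, read-write} it falls back to index 0 after scanning the whole list, which costs one extra connection attempt to the first host.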
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 12bc0b3fce..0d1e6a756c 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1648,12 +1648,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- The supported options for this parameter are, <literal>any</literal> and
- <literal>read-write</literal>. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- If multiple hosts were specified in the connection string, based on the
- specified value, any remaining servers will be tried before confirming
- succesful connection or failure.
+ The supported options for this parameter are, <literal>any</literal>,
+ <literal>read-write</literal> and <literal>prefer-read</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts were specified in the
+ connection string, based on the specified value, any remaining servers
+ will be tried before confirming succesful connection or failure.
</para>
<para>
@@ -1662,6 +1662,13 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
is considered acceptable.
</para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, only a
+ connection in which read-only transactions are accepted by default
+ is preferred. If no such connections can be found, then a connection
+ in which read-write transactions accepted will be considered.
+ </para>
+
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
@@ -1671,7 +1678,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
configuration parameter that is reported by the server upon successful connection.
</para>
</listitem>
- </varlistentry>
+ </varlistentry>
</variablelist>
</para>
</sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index ca5e6d67d3..a6a5df6fc5 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -340,7 +340,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1298,6 +1298,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_ANY;
else if (strcmp(conn->target_session_attrs, "read-write") == 0)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else
{
conn->status = CONNECTION_BAD;
@@ -2233,13 +2235,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means, failed to connect to read-only servers
+ * and now try connect to read-write server again.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3475,7 +3495,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required, see
+ * if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3490,8 +3511,8 @@ keep_going: /* We will come back to here until there is
* Save existing error messages across the PQsendQuery
* attempt. This is necessary because PQsendQuery is
* going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
*/
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
@@ -3503,16 +3524,30 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
+
conn->status = CONNECTION_CHECK_WRITABLE;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
- else if (conn->transaction_read_only)
+ else if ((conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
const char *displayed_host;
const char *displayed_port;
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_target_connection;
+
/* Append error report to conn->errorMessage. */
if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
displayed_host = conn->connhost[conn->whichhost].hostaddr;
@@ -3522,16 +3557,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3539,8 +3586,36 @@ keep_going: /* We will come back to here until there is
conn->try_next_host = true;
goto keep_going;
}
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2)
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3606,11 +3681,33 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested mode is read-write,
+ * ignore it. Server is read-write and requested mode is
+ * prefer-read, record it for the first time and try to
+ * consume in the next scan (it means no read-only server
+ * is found in the first scan).
+ */
+ if ((readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_write_connection;
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3623,16 +3720,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3641,7 +3750,8 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3828,6 +3938,7 @@ makeEmptyPGconn(void)
#endif
conn->requested_session_type = SESSION_TYPE_ANY;
+ conn->read_write_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index ab415d684a..a57cfa1f93 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -74,7 +74,8 @@ typedef enum
typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
- SESSION_TYPE_READ_WRITE /* Read-write session */
+ SESSION_TYPE_READ_WRITE, /* Read-write session */
+ SESSION_TYPE_PREFER_READ /* Prefer read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index c50877e17a..69c5d3e349 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -364,7 +364,10 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: any, read-write,
+ * prefer-read.
+ */
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -401,6 +404,14 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * First read-write host index in the connection string.
+ *
+ * Initial value is -1, then the index of the first read-write host, -2
+ * during the second attempt of connection to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index beb45551a2..0e398136a5 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 26;
+use Test::More tests => 29;
# Initialize master node
my $node_master = get_new_node('master');
@@ -117,6 +117,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to master in "prefer-read" mode with a master-only list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
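
For readers following the prefer-read logic in the patch above, here is a minimal sketch of the two-pass host scan it implements. This is a simplified model, not libpq code: the HostState enum and pick_prefer_read() are hypothetical names invented for illustration, and the real patch drives the scan through PQconnectPoll() with the -1/-2 sentinel values of read_write_host_index rather than a plain loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host states for this model; these are not libpq types. */
typedef enum
{
	HOST_DOWN,					/* connection attempt fails */
	HOST_READ_WRITE,			/* transaction_read_only reports "off" */
	HOST_READ_ONLY				/* transaction_read_only reports "on" */
} HostState;

/*
 * Return the index of the host a prefer-read scan would settle on,
 * or -1 if no host is usable.
 */
int
pick_prefer_read(const HostState *hosts, int nhosts)
{
	int			read_write_host_index = -1; /* -1: nothing remembered yet */
	int			i;

	assert(hosts != NULL || nhosts == 0);

	/* First pass: accept the first read-only host in list order. */
	for (i = 0; i < nhosts; i++)
	{
		if (hosts[i] == HOST_READ_ONLY)
			return i;

		/* Remember only the first read-write host as the fallback. */
		if (hosts[i] == HOST_READ_WRITE && read_write_host_index == -1)
			read_write_host_index = i;
	}

	/*
	 * Second pass: no read-only host was found, so fall back to the
	 * remembered read-write host (still -1 if there was none).
	 */
	return read_write_host_index;
}
```

With a host list of {read-write, read-only} the sketch picks index 1, and with a single read-write host it picks index 0, which matches the three prefer-read cases added to 001_stream_rep.pl.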
0005-New-read-only-target_session_attrs-type.patch (application/octet-stream)
From 14c742dca19bb2e666ffd0a9e6bcb4a7a171e3cc Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 18:02:59 +1100
Subject: [PATCH 5/8] New read-only target_session_attrs type
With this read-only option type, an application can connect
to a read-only server in the list of hosts. If no read-only
server is available, the connection attempt fails.
---
doc/src/sgml/libpq.sgml | 7 +++++-
src/interfaces/libpq/fe-connect.c | 34 ++++++++++++++++++++-------
src/interfaces/libpq/libpq-fe.h | 3 ++-
src/interfaces/libpq/libpq-int.h | 2 +-
src/test/recovery/t/001_stream_rep.pl | 10 +++++++-
5 files changed, 43 insertions(+), 13 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 0d1e6a756c..64ea89a960 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1649,7 +1649,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal> and <literal>prefer-read</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1677,6 +1677,11 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
But for server version 12 or greater uses the value of <varname>transaction_read_only</varname>
configuration parameter that is reported by the server upon successful connection.
</para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection
+ in which read-only transactions are accepted by default is considered acceptable.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index a6a5df6fc5..4b2bf66fbc 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1300,6 +1300,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
+ else if (strcmp(conn->target_session_attrs, "read-only") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_ONLY;
else
{
conn->status = CONNECTION_BAD;
@@ -3495,8 +3497,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write or prefer-read connection is required, see
- * if we have one.
+ * If a read-write, prefer-read or read-only connection is
+ * required, see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3533,7 +3535,8 @@ keep_going: /* We will come back to here until there is
else if ((conn->transaction_read_only &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!conn->transaction_read_only &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/* Not a requested type; fail this connection. */
const char *displayed_host;
@@ -3593,17 +3596,28 @@ keep_going: /* We will come back to here until there is
/*
* Requested type is prefer-read, then record this host index
- * and try the other before considering it later
+ * and try the other before considering it later. If requested
+ * type of connection is read-only, ignore this connection.
*/
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)
{
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that are
+ * default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto target_accept_connection;
+
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->read_write_host_index == -1)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
/*
@@ -3691,12 +3705,14 @@ keep_going: /* We will come back to here until there is
* ignore it. Server is read-write and requested mode is
* prefer-read, record it for the first time and try to
* consume in the next scan (it means no read-only server
- * is found in the first scan).
+ * is found in the first scan). Server is read-write and
+ * requested mode is read-only, ignore this connection.
*/
if ((readonly_server &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!readonly_server &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/*
* The following scenario is possible only for the
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index a57cfa1f93..76f4222f77 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -75,7 +75,8 @@ typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
- SESSION_TYPE_PREFER_READ /* Prefer read only session */
+ SESSION_TYPE_PREFER_READ, /* Prefer read only session */
+ SESSION_TYPE_READ_ONLY /* Read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 69c5d3e349..dbba6935e8 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -366,7 +366,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read.
+ * prefer-read and read-only.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 0e398136a5..651107f49b 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 29;
+use Test::More tests => 31;
# Initialize master node
my $node_master = get_new_node('master');
@@ -129,6 +129,14 @@ test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
test_target_session_attrs($node_master, $node_master, $node_master,
"prefer-read", 0);
+# Connect to standby1 in "read-only" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "read-only", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
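
The option parsing and the rejection test that the read-only patch extends can be summarised in a small standalone sketch. The names SessionKind, parse_session_kind() and rejects_on_first_pass() are hypothetical and exist only for this illustration; the patch itself stores the result in conn->requested_session_type and sets CONNECTION_BAD on an unknown value instead of returning a sentinel.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Local stand-in for the patch's TargetSessionAttrsType (assumption). */
typedef enum
{
	KIND_ANY = 0,
	KIND_READ_WRITE,
	KIND_PREFER_READ,
	KIND_READ_ONLY,
	KIND_INVALID = -1			/* hypothetical: libpq reports an error instead */
} SessionKind;

/* Map a target_session_attrs string to a kind, as the strcmp chain does. */
SessionKind
parse_session_kind(const char *value)
{
	assert(value != NULL);

	if (strcmp(value, "any") == 0)
		return KIND_ANY;
	if (strcmp(value, "read-write") == 0)
		return KIND_READ_WRITE;
	if (strcmp(value, "prefer-read") == 0)
		return KIND_PREFER_READ;
	if (strcmp(value, "read-only") == 0)
		return KIND_READ_ONLY;
	return KIND_INVALID;
}

/*
 * True when a freshly checked server does not match the request on the
 * first pass over the host list: a read-only server for read-write, or a
 * read-write server for prefer-read and read-only.
 */
bool
rejects_on_first_pass(SessionKind requested, bool server_read_only)
{
	return (server_read_only && requested == KIND_READ_WRITE) ||
		(!server_read_only &&
		 (requested == KIND_PREFER_READ || requested == KIND_READ_ONLY));
}
```

The difference between the two read-oriented modes lies outside this predicate: prefer-read remembers the rejected read-write host for a later fallback, while read-only never records one, so the connection attempt fails when no read-only server exists.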
0006-Primary-prefer-standby-and-standby-options.patch (application/octet-stream)
From f667b89d00f6540c6c44169f85ecee73f1a46876 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 10 Apr 2019 23:19:09 +1000
Subject: [PATCH 6/8] Primary, prefer-standby and standby options
New options that check whether a server is in recovery mode
before considering it for connection. To determine whether
the server is in recovery mode, libpq sends the query
'SELECT pg_is_in_recovery()' to the server.
---
doc/src/sgml/libpq.sgml | 26 ++-
src/interfaces/libpq/fe-connect.c | 236 +++++++++++++++++++++++---
src/interfaces/libpq/libpq-fe.h | 8 +-
src/interfaces/libpq/libpq-int.h | 4 +-
src/test/recovery/t/001_stream_rep.pl | 18 +-
5 files changed, 262 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 64ea89a960..c8a1ae7afa 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1649,7 +1649,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>prefer-standby</literal> and <literal>standby</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1682,6 +1683,29 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
If this parameter is set to <literal>read-only</literal>, only a connection
in which read-only transactions are accepted by default is considered acceptable.
</para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, only a connection in which
+ the server is not in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection in which
+ the server is in recovery mode is preferred. If no such connection can be found,
+ then a connection in which the server is not in recovery mode will be considered.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server is in recovery mode, the query <literal>SELECT pg_is_in_recovery()</literal>
+ will be sent upon any successful connection; if it returns <literal>t</literal>, the server
+ is in recovery mode.
+ </para>
+
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 4b2bf66fbc..027fefcf57 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -124,6 +124,7 @@ static int ldapServiceLookup(const char *purl, PQconninfoOption *options,
#define DefaultOption ""
#define DefaultAuthtype ""
#define DefaultTargetSessionAttrs "any"
+#define DefaultTargetServerType "any"
#ifdef USE_SSL
#define DefaultSSLMode "prefer"
#else
@@ -340,7 +341,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1302,6 +1303,12 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else if (strcmp(conn->target_session_attrs, "read-only") == 0)
conn->requested_session_type = SESSION_TYPE_READ_ONLY;
+ else if (strcmp(conn->target_session_attrs, "primary") == 0)
+ conn->requested_session_type = SESSION_TYPE_PRIMARY;
+ else if (strcmp(conn->target_session_attrs, "prefer-standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_STANDBY;
+ else if (strcmp(conn->target_session_attrs, "standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_STANDBY;
else
{
conn->status = CONNECTION_BAD;
@@ -2198,6 +2205,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -2237,19 +2245,19 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- if (conn->read_write_host_index >= 0)
+ if (conn->read_write_or_primary_host_index >= 0)
{
/*
* Getting here means, failed to connect to read-only servers
* and now try connect to read-write server again.
*/
- conn->whichhost = conn->read_write_host_index;
+ conn->whichhost = conn->read_write_or_primary_host_index;
/*
* Reset the host index value to avoid recursion during the
* second connection attempt.
*/
- conn->read_write_host_index = -2;
+ conn->read_write_or_primary_host_index = -2;
}
else
{
@@ -3505,7 +3513,9 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->requested_session_type != SESSION_TYPE_ANY)
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY))
{
if (conn->sversion < 120000)
{
@@ -3548,7 +3558,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
/* Append error report to conn->errorMessage. */
@@ -3579,8 +3589,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3595,30 +3605,70 @@ keep_going: /* We will come back to here until there is
}
/*
- * Requested type is prefer-read, then record this host index
- * and try the other before considering it later. If requested
- * type of connection is read-only, ignore this connection.
+ * Servers before 9.0 don't support recovery, so skip the check
+ * when the requested type of connection is primary,
+ * prefer-standby or standby.
*/
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_session_type == SESSION_TYPE_PRIMARY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ /*
+ * If the requested type is prefer-read or prefer-standby,
+ * record this host index and try the other hosts before
+ * considering it later. If the requested type is read-only or
+ * standby, ignore this connection.
+ */
+
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
- conn->requested_session_type == SESSION_TYPE_READ_ONLY)
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)
{
/*
* The following scenario is possible only for the
- * prefer-read mode for the next pass of the list of
- * connections as it couldn't find any servers that are
- * default read-only.
+ * prefer-read or prefer-standby mode for the next pass of
+ * the list of connections as it couldn't find any servers
+ * that are default read-only or in recovery mode.
*/
- if (conn->read_write_host_index == -2)
- goto target_accept_connection;
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY)
+ {
+ if (conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+ }
/*
* Try next host if any, but we don't want to consider
@@ -3720,7 +3770,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_write_connection;
/* Not a requested type; fail this connection. */
@@ -3755,8 +3805,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3809,6 +3859,144 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in recovery mode and requested mode is
+ * primary, ignore it. Server is not in recovery mode and
+ * requested mode is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server is found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+
+ /*
+ * This case arises only in prefer-standby mode, during
+ * the second pass over the host list, after the first
+ * pass found no server that is in recovery (that is,
+ * no standby).
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_recovery_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
+ consume_checked_recovery_connection:
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3954,7 +4142,7 @@ makeEmptyPGconn(void)
#endif
conn->requested_session_type = SESSION_TYPE_ANY;
- conn->read_write_host_index = -1;
+ conn->read_write_or_primary_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 76f4222f77..bc1ab127ea 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
@@ -76,7 +77,10 @@ typedef enum
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
SESSION_TYPE_PREFER_READ, /* Prefer read only session */
- SESSION_TYPE_READ_ONLY /* Read only session */
+ SESSION_TYPE_READ_ONLY, /* Read only session */
+ SESSION_TYPE_PRIMARY, /* Primary server */
+ SESSION_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SESSION_TYPE_STANDBY /* Standby server */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index dbba6935e8..e9583bfd95 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -366,7 +366,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read and read-only.
+ * prefer-read, read-only, primary, prefer-standby and standby.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -410,7 +410,7 @@ struct pg_conn
* Initial value is -1, then the index of the first read-write host, -2
* during the second attempt of connection to avoid recursion.
*/
- int read_write_host_index;
+ int read_write_or_primary_host_index;
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 651107f49b..f18fef4445 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 31;
+use Test::More tests => 35;
# Initialize master node
my $node_master = get_new_node('master');
@@ -137,6 +137,22 @@ test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"read-only", 0);
+# Connect to master in "primary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
note "switching to physical replication slot";
# Switch to using a physical replication slot. We can do this without a new
--
2.20.1.windows.1
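
The decision made in the new CONNECTION_CHECK_RECOVERY state can be modelled in isolation as follows. This is a sketch under stated assumptions: RecoveryPreference, value_means_standby() and rejects_after_recovery_check() are hypothetical names, and the second_pass flag stands in for the patch's read_write_or_primary_host_index == -2 test. The only behaviour taken from the patch is that the text result of SELECT pg_is_in_recovery() is compared against "t" and that the fallback pass accepts the remembered host.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Local stand-in for the three recovery-based session types (assumption). */
typedef enum
{
	WANT_PRIMARY,
	WANT_PREFER_STANDBY,
	WANT_STANDBY
} RecoveryPreference;

/* pg_is_in_recovery() yields a boolean, rendered as "t" or "f" in text form. */
bool
value_means_standby(const char *val)
{
	assert(val != NULL);
	return strncmp(val, "t", 1) == 0;
}

/*
 * True when the checked server must be dropped and the next host tried.
 * On the fallback pass the remembered host is accepted regardless.
 */
bool
rejects_after_recovery_check(RecoveryPreference wanted, const char *val,
							 bool second_pass)
{
	bool		standby_server = value_means_standby(val);
	bool		mismatch;

	mismatch = (standby_server && wanted == WANT_PRIMARY) ||
		(!standby_server &&
		 (wanted == WANT_PREFER_STANDBY || wanted == WANT_STANDBY));

	return mismatch && !second_pass;
}
```

In the patch, only prefer-standby records a fallback host, so the fallback pass (second_pass set to true here) is never reached for primary or standby.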
0007-New-function-to-rejecting-the-checked-write-connecti.patch (application/octet-stream)
From 0c24f713481021c2c249e83077c88d795e4a42be Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Mon, 25 Mar 2019 18:11:18 +1100
Subject: [PATCH 7/8] New function to rejecting the checked write connection
After checking whether a connection is writable, if we decide
to reject it based on the result, call the newly added
function to do so.
---
src/interfaces/libpq/fe-connect.c | 123 ++++++++++++------------------
1 file changed, 47 insertions(+), 76 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 027fefcf57..6fd71b24f8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2123,6 +2123,51 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+static void
+reject_checked_write_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3548,10 +3593,6 @@ keep_going: /* We will come back to here until there is
(conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
- /* Not a requested type; fail this connection. */
- const char *displayed_host;
- const char *displayed_port;
-
/*
* The following scenario is possible only for the
* prefer-read mode for the next pass of the list of
@@ -3561,42 +3602,7 @@ keep_going: /* We will come back to here until there is
if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
@@ -3777,42 +3783,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
--
2.20.1.windows.1
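The refactoring in the patch above moves the duplicated "pick the host and port to display, then append the error" code into one helper. The following is only a minimal, self-contained sketch of that idea, not the libpq implementation: the types `host_entry`, `host_kind` and `session_kind`, the function `format_rejection`, and the macro `DEF_PORT_STR` are hypothetical stand-ins for libpq's internal `connhost[]` entries, `requested_session_type` and `DEF_PGPORT_STR`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEF_PORT_STR "5432"		/* hypothetical stand-in for DEF_PGPORT_STR */

/* Hypothetical, simplified stand-ins for libpq's internal types. */
typedef enum { HOST_NAME, HOST_ADDRESS } host_kind;
typedef enum { WANT_READ_WRITE, WANT_READ_ONLY } session_kind;

typedef struct
{
	host_kind	kind;		/* was the host given by name or by address? */
	const char *host;
	const char *hostaddr;
	const char *port;		/* may be NULL or empty */
} host_entry;

/*
 * Build the rejection message in one place so that every call site that
 * rejects a checked connection can share it.
 */
static void
format_rejection(const host_entry *h, session_kind wanted,
				 char *buf, size_t buflen)
{
	const char *shown_host;
	const char *shown_port;

	assert(h != NULL && buf != NULL && buflen > 0);

	/* Prefer the numeric address when that is how the host was specified. */
	shown_host = (h->kind == HOST_ADDRESS) ? h->hostaddr : h->host;

	/* Fall back to the default port when none was given. */
	shown_port = h->port;
	if (shown_port == NULL || shown_port[0] == '\0')
		shown_port = DEF_PORT_STR;

	snprintf(buf, buflen,
			 "could not make a %s connection to server \"%s:%s\"\n",
			 wanted == WANT_READ_WRITE ? "writable" : "readonly",
			 shown_host, shown_port);
}
```

In the real patch the helper also terminates the connection, records the first read-write host for the prefer-read mode, and sets try_next_host; the sketch covers only the message-building part.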
0008-Server-recovery-mode-handling.patch (application/octet-stream)
From 58cd03e69d53270f6a2f252b68a59b284c996566 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 28 Mar 2019 15:30:01 +1100
Subject: [PATCH 8/8] Server recovery mode handling
An in_recovery GUC_REPORT parameter is added to tell clients whether
the server is in recovery mode. This lets client connections check
for a standby server faster than they could by executing a
command.
A new SIGUSR1 interrupt is added to report the exit from
recovery mode to all backends and their respective
clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
---
doc/src/sgml/libpq.sgml | 14 ++-
doc/src/sgml/protocol.sgml | 8 +-
src/backend/access/transam/xlog.c | 3 +
src/backend/commands/async.c | 1 -
src/backend/storage/ipc/procarray.c | 28 ++++++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 ++
src/backend/tcop/postgres.c | 60 ++++++++++++
src/backend/utils/init/postinit.c | 6 +-
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 17 +++-
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/include/tcop/tcopprot.h | 2 +
src/interfaces/libpq/fe-connect.c | 133 +++++++++++++++++----------
src/interfaces/libpq/fe-exec.c | 4 +
src/interfaces/libpq/libpq-int.h | 1 +
18 files changed, 234 insertions(+), 61 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c8a1ae7afa..89a3f4c6e9 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1702,8 +1702,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server is in recovery mode or not, query <literal>SELECT pg_is_in_recovery()</literal>
- will be sent upon any successful connection; if it returns <literal>t</literal>, means server
- is in recovery mode.
+ will be sent upon any successful connection if the server is prior to version 12; if it returns
+ <literal>t</literal>, the server is in recovery mode. For server version 12 or later, libpq instead
+ uses the value of the <varname>in_recovery</varname> configuration parameter that is reported by the
+ server upon successful connection.
</para>
</listitem>
@@ -2071,15 +2073,17 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by release before 12.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index d62ccceaa0..a3984d6ac4 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1284,15 +1284,17 @@ SELCT 1/0;
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c00b63c751..d2a0b4fd91 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7755,6 +7755,9 @@ StartupXLOG(void)
XLogCtl->SharedRecoveryInProgress = false;
SpinLockRelease(&XLogCtl->info_lck);
+ if (standbyState != STANDBY_DISABLED)
+ SendRecoveryExitSignal();
+
UpdateControlFile();
LWLockRelease(ControlFileLock);
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 5a7ee0de4c..9555da37d2 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -1736,7 +1736,6 @@ ProcessNotifyInterrupt(void)
ProcessIncomingNotify();
}
-
/*
* Read all pending notifications from the queue, and deliver appropriate
* ones to my frontend. Stop when we reach queue head or an uncommitted
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 010cc061c8..75071d6b80 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2972,6 +2972,34 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
return true; /* timed out, still conflicts */
}
+/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
/*
* ProcArraySetReplicationSlotXmin
*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7605b2c367..e4548dc323 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -292,6 +292,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
SetLatch(MyLatch);
latch_sigusr1_handler();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 215f1463bb..157a8442df 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -138,6 +138,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 44a59e1d4f..33511a9433 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -167,6 +167,15 @@ static bool RecoveryConflictPending = false;
static bool RecoveryConflictRetryable = true;
static ProcSignalReason RecoveryConflictReason;
+/*
+ * Inbound recovery exit interrupts are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
+
/* reused buffer to pass to SendRowDescriptionMessage() */
static MemoryContext row_description_context = NULL;
static StringInfoData row_description_buf;
@@ -195,6 +204,7 @@ static void drop_unnamed_stmt(void);
static void log_disconnections(int code, Datum arg);
static void enable_statement_timeout(void);
static void disable_statement_timeout(void);
+static void ProcessRecoveryExitInterrupt(void);
/* ----------------------------------------------------------------
@@ -543,6 +553,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
@@ -2954,6 +2968,52 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
errno = save_errno;
}
+/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited the recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+ */
+static void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
+
+
/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 1c2a99c9c8..1c4599af24 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -649,7 +649,11 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
* This is handled by calling RecoveryInProgress and ignoring the
* result.
*/
- (void) RecoveryInProgress();
+ if (RecoveryInProgress())
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
}
else
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index d228bbed68..1c0f51ac8b 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_int \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 045b659155..bcf4176c70 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -582,7 +582,7 @@ static char *recovery_target_xid_string;
static char *recovery_target_time_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
-
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
char *role_string;
@@ -1770,6 +1770,21 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ /*
+ * Not for general use --- used to indicate whether the instance is
+ * in recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
{
{"allow_system_table_mods", PGC_POSTMASTER, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index bd24850989..296d606e51 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -113,6 +113,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void XidCacheRemoveRunningXids(TransactionId xid,
int nxids, const TransactionId *xids,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 9f2f965d5c..722357c829 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -42,6 +42,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 2361243514..3bb9023a6b 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index b367838612..d73ff7ff87 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -71,6 +71,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
* handler */
+/* recovery exit interrupt handling function */
+extern void HandleRecoveryExitInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 6fd71b24f8..2989f3e5d0 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2168,6 +2168,49 @@ reject_checked_write_connection(PGconn *conn)
conn->try_next_host = true;
}
+static void
+reject_checked_recovery_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3620,27 +3663,52 @@ keep_going: /* We will come back to here until there is
conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
conn->requested_session_type == SESSION_TYPE_STANDBY)))
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
}
+ else if ((conn->in_recovery &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
- conn->status = CONNECTION_CHECK_RECOVERY;
+ reject_checked_recovery_connection(conn);
+ goto keep_going;
+ }
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
}
/*
@@ -3889,40 +3957,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is not in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record primary host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_recovery_connection(conn);
goto keep_going;
}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index d2658efba5..38b8abcc44 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1117,6 +1117,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
{
conn->transaction_read_only = (strcmp(value, "on") == 0);
}
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index e9583bfd95..ee2f3da09c 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -443,6 +443,7 @@ struct pg_conn
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
bool transaction_read_only; /* transaction_read_only */
+ bool in_recovery; /* in_recovery */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.20.1.windows.1
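The client-side check that the patch above adds to PQconnectPoll can be summarized as a small decision function. This is a minimal sketch under stated assumptions rather than the patch's code: the enums `target_type` and `check_result` and the function `check_recovery_state` are hypothetical names, and the sketch leaves out the second pass that prefer-standby makes over the host list (where a primary is accepted after no standby was found).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; they only mirror the session types used in the patch. */
typedef enum
{
	TARGET_ANY,
	TARGET_PRIMARY,
	TARGET_PREFER_STANDBY,
	TARGET_STANDBY
} target_type;

typedef enum
{
	CHECK_ACCEPT,			/* server matches the requested type */
	CHECK_REJECT,			/* server does not match; try the next host */
	CHECK_NEEDS_QUERY		/* must run SELECT pg_is_in_recovery() */
} check_result;

/*
 * Decide what to do with a freshly established connection.
 *
 * server_version uses the numeric form (120000 for version 12), and
 * in_recovery is the value reported by the server, which is only
 * meaningful when the server is new enough to report it.
 */
static check_result
check_recovery_state(int server_version, bool in_recovery, target_type wanted)
{
	assert(server_version > 0);

	if (wanted == TARGET_ANY)
		return CHECK_ACCEPT;

	/* Older servers do not report in_recovery, so fall back to a query. */
	if (server_version < 120000)
		return CHECK_NEEDS_QUERY;

	if (in_recovery && wanted == TARGET_PRIMARY)
		return CHECK_REJECT;

	if (!in_recovery &&
		(wanted == TARGET_PREFER_STANDBY || wanted == TARGET_STANDBY))
		return CHECK_REJECT;

	return CHECK_ACCEPT;
}
```

Using the reported parameter avoids one query round trip per candidate host, which is the "faster check" that the commit message refers to.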
On Thu, Apr 11, 2019 at 9:13 AM Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
I fixed all the comments that you have raised above and attached the updated patches.
Rebased patches are attached.
Regards,
Haribabu Kommi
Fujitsu Australia
Attachments:
0001-New-pg_basebackup-g-option-to-control-the-group-acce.patch (application/octet-stream)
From 870435b9215901de59b651812f3af5af2bfad93a Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Tue, 12 Feb 2019 17:55:53 +1100
Subject: [PATCH] New pg_basebackup -g option to control the group access
permissions
By default, pg_basebackup gives the backup files the same permissions
as the source instance; with this option, the user can control that
behavior.
--group-mode = inherit (default) (same permissions as the source instance)
--group-mode = none (no group access permissions)
--group-mode = group (group access permissions)
The same applies to the database directory that is already present.
To support group access permissions in tar mode, the
BASE_BACKUP protocol command is enhanced with a new option, "GROUP_MODE",
which accepts "none" and "group". The option is sent to the
server whenever the user requests group access permissions
other than the default.
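As a rough illustration of the three modes described above, the sketch below rewrites the permission bits of a file mode. It is a simplified assumption-laden example, not the patch's code: `group_mode` and `apply_group_mode` are hypothetical names, the octal constants are written out to mirror PostgreSQL's owner-only and group-readable modes, and unlike the patch it always clears the full owner and group permission mask before setting the requested bits.

```c
#define _XOPEN_SOURCE 700		/* for S_IFDIR / S_IFREG under strict C modes */
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical enum mirroring --group-mode = inherit | none | group. */
typedef enum { GROUP_INHERIT, GROUP_NONE, GROUP_GROUP } group_mode;

/* Permission sets assumed here: owner-only vs. owner plus group read. */
#define DIR_MODE_OWNER	0700
#define FILE_MODE_OWNER 0600
#define DIR_MODE_GROUP	0750
#define FILE_MODE_GROUP 0640

/*
 * Return st_mode with its owner/group permission bits replaced according
 * to the requested group mode.  File-type bits are left untouched.
 */
static mode_t
apply_group_mode(mode_t st_mode, group_mode gm)
{
	mode_t		perm;

	assert(gm == GROUP_INHERIT || gm == GROUP_NONE || gm == GROUP_GROUP);

	/* "inherit" keeps whatever the source instance uses. */
	if (gm == GROUP_INHERIT)
		return st_mode;

	if (S_ISDIR(st_mode))
		perm = (gm == GROUP_GROUP) ? DIR_MODE_GROUP : DIR_MODE_OWNER;
	else
		perm = (gm == GROUP_GROUP) ? FILE_MODE_GROUP : FILE_MODE_OWNER;

	/* Clear the owner and group permission bits, then set the new ones. */
	return (st_mode & ~(mode_t) 0770) | perm;
}
```

For example, a directory with mode 0700 becomes 0750 under "group", and a file with mode 0640 becomes 0600 under "none".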
---
doc/src/sgml/protocol.sgml | 13 +++-
doc/src/sgml/ref/pg_basebackup.sgml | 48 ++++++++++++
src/backend/replication/basebackup.c | 46 +++++++++++
src/backend/replication/repl_gram.y | 8 +-
src/backend/replication/repl_scanner.l | 1 +
src/bin/pg_basebackup/pg_basebackup.c | 65 +++++++++++-----
src/bin/pg_basebackup/streamutil.c | 53 +++++++------
src/bin/pg_basebackup/streamutil.h | 11 +++
src/bin/pg_basebackup/t/010_pg_basebackup.pl | 82 +++++++++++++++++++-
9 files changed, 284 insertions(+), 43 deletions(-)
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index b20f1690a7..c3611c53cf 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2466,7 +2466,7 @@ The commands accepted in replication mode are:
</varlistentry>
<varlistentry>
- <term><literal>BASE_BACKUP</literal> [ <literal>LABEL</literal> <replaceable>'label'</replaceable> ] [ <literal>PROGRESS</literal> ] [ <literal>FAST</literal> ] [ <literal>WAL</literal> ] [ <literal>NOWAIT</literal> ] [ <literal>MAX_RATE</literal> <replaceable>rate</replaceable> ] [ <literal>TABLESPACE_MAP</literal> ] [ <literal>NOVERIFY_CHECKSUMS</literal> ]
+ <term><literal>BASE_BACKUP</literal> [ <literal>LABEL</literal> <replaceable>'label'</replaceable> ] [ <literal>PROGRESS</literal> ] [ <literal>FAST</literal> ] [ <literal>WAL</literal> ] [ <literal>NOWAIT</literal> ] [ <literal>MAX_RATE</literal> <replaceable>rate</replaceable> ] [ <literal>TABLESPACE_MAP</literal> ] [ <literal>NOVERIFY_CHECKSUMS</literal> ] [ <literal>GROUP_MODE</literal> <replaceable>'mode'</replaceable> ]
<indexterm><primary>BASE_BACKUP</primary></indexterm>
</term>
<listitem>
@@ -2576,6 +2576,17 @@ The commands accepted in replication mode are:
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>GROUP_MODE</literal> <replaceable>'mode'</replaceable></term>
+ <listitem>
+ <para>
+ By default, the group access permissions are the same as those of the source instance. If <literal>none</literal> is specified,
+ the backup files do not have group access permissions. If <literal>group</literal> is specified, the backup
+ files have group access permissions.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
<para>
diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index fc9e222f8d..89cf9895c8 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -536,6 +536,53 @@ PostgreSQL documentation
</para>
</listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><option>-g <replaceable class="parameter">mode</replaceable></option></term>
+ <term><option>--group-mode=<replaceable class="parameter">mode</replaceable></option></term>
+ <listitem>
+ <para>
+ Controls the group permissions of the files in the backup. This option is ignored
+ on <productname>Windows</productname> as it does not support <acronym>POSIX</acronym>-style
+ group permissions. The following methods are available to control the group permissions:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>n</literal></term>
+ <term><literal>none</literal></term>
+ <listitem>
+ <para>
+ Don't include group access permissions in the backup.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>i</literal></term>
+ <term><literal>inherit</literal></term>
+ <listitem>
+ <para>
+ Use the same group access permissions as the source instance.
+ </para>
+ <para>
+ This value is the default.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>g</literal></term>
+ <term><literal>group</literal></term>
+ <listitem>
+ <para>
+ Include group access permissions for all the backup files.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
@@ -742,6 +789,7 @@ PostgreSQL documentation
or an older major version, down to 9.1. However, WAL streaming mode (<literal>-X
stream</literal>) only works with server version 9.3 and later, and tar format mode
(<literal>--format=tar</literal>) of the current version only works with server version 9.5
+ or later. The group mode option (<literal>-g</literal>) of the current version only works with server version 12
or later.
</para>
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index c2978a949a..713c00d46d 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -54,6 +54,8 @@ typedef struct
bool sendtblspcmapfile;
} basebackup_options;
+static int backup_dir_create_mode = 0;
+static int backup_file_create_mode = 0;
static int64 sendDir(const char *path, int basepathlen, bool sizeonly,
List *tablespaces, bool sendtblspclinks);
@@ -650,6 +652,7 @@ parse_basebackup_options(List *options, basebackup_options *opt)
bool o_maxrate = false;
bool o_tablespace_map = false;
bool o_noverify_checksums = false;
+ bool o_group_mode = false;
MemSet(opt, 0, sizeof(*opt));
foreach(lopt, options)
@@ -738,6 +741,30 @@ parse_basebackup_options(List *options, basebackup_options *opt)
noverify_checksums = true;
o_noverify_checksums = true;
}
+ else if (strcmp(defel->defname, "group_mode") == 0)
+ {
+ if (o_group_mode)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("duplicate option \"%s\"", defel->defname)));
+
+ if (strcmp(strVal(defel->arg), "none") == 0)
+ {
+ backup_dir_create_mode = PG_DIR_MODE_OWNER;
+ backup_file_create_mode = PG_FILE_MODE_OWNER;
+ }
+ else if (strcmp(strVal(defel->arg), "group") == 0)
+ {
+ backup_dir_create_mode = PG_DIR_MODE_GROUP;
+ backup_file_create_mode = PG_FILE_MODE_GROUP;
+ }
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid value for option \"%s\": \"%s\"", defel->defname, strVal(defel->arg))));
+
+ o_group_mode = true;
+ }
else
elog(ERROR, "option \"%s\" not recognized",
defel->defname);
@@ -1602,6 +1629,25 @@ _tarWriteHeader(const char *filename, const char *linktarget,
if (!sizeonly)
{
+ /*
+ * Adjust the mode of the file according to the backup request, ignore
+ * it for tablespace links.
+ */
+ if (!linktarget && backup_file_create_mode)
+ {
+ if (S_ISDIR(statbuf->st_mode))
+ {
+ statbuf->st_mode &= ~(pg_dir_create_mode);
+ statbuf->st_mode |= backup_dir_create_mode;
+ }
+ else
+ {
+ statbuf->st_mode &= ~(pg_file_create_mode);
+ statbuf->st_mode |= backup_file_create_mode;
+ }
+
+ }
+
rc = tarCreateHeader(h, filename, linktarget, statbuf->st_size,
statbuf->st_mode, statbuf->st_uid, statbuf->st_gid,
statbuf->st_mtime);
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index c4e11cc4e8..44059e299b 100644
--- a/src/backend/replication/repl_gram.y
+++ b/src/backend/replication/repl_gram.y
@@ -78,6 +78,7 @@ static SQLCmd *make_sqlcmd(void);
%token K_WAL
%token K_TABLESPACE_MAP
%token K_NOVERIFY_CHECKSUMS
+%token K_GROUP_MODE
%token K_TIMELINE
%token K_PHYSICAL
%token K_LOGICAL
@@ -155,7 +156,7 @@ var_name: IDENT { $$ = $1; }
/*
* BASE_BACKUP [LABEL '<label>'] [PROGRESS] [FAST] [WAL] [NOWAIT]
- * [MAX_RATE %d] [TABLESPACE_MAP] [NOVERIFY_CHECKSUMS]
+ * [MAX_RATE %d] [TABLESPACE_MAP] [NOVERIFY_CHECKSUMS] [GROUP_MODE '<mode>']
*/
base_backup:
K_BASE_BACKUP base_backup_opt_list
@@ -214,6 +215,11 @@ base_backup_opt:
$$ = makeDefElem("noverify_checksums",
(Node *)makeInteger(true), -1);
}
+ | K_GROUP_MODE SCONST
+ {
+ $$ = makeDefElem("group_mode",
+ (Node *)makeString($2), -1);
+ }
;
create_replication_slot:
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index 380faeb5f6..7d1fde67bd 100644
--- a/src/backend/replication/repl_scanner.l
+++ b/src/backend/replication/repl_scanner.l
@@ -93,6 +93,7 @@ MAX_RATE { return K_MAX_RATE; }
WAL { return K_WAL; }
TABLESPACE_MAP { return K_TABLESPACE_MAP; }
NOVERIFY_CHECKSUMS { return K_NOVERIFY_CHECKSUMS; }
+GROUP_MODE { return K_GROUP_MODE; }
TIMELINE { return K_TIMELINE; }
START_REPLICATION { return K_START_REPLICATION; }
CREATE_REPLICATION_SLOT { return K_CREATE_REPLICATION_SLOT; }
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 9f0bede93b..a7d19d1072 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -350,6 +350,8 @@ usage(void)
printf(_(" -N, --no-sync do not wait for changes to be written safely to disk\n"));
printf(_(" -P, --progress show progress information\n"));
printf(_(" -S, --slot=SLOTNAME replication slot to use\n"));
+ printf(_(" -g, --group-mode=inherit|group|none\n"
+ " specify required group access mode for basebackup directory\n"));
printf(_(" -v, --verbose output verbose messages\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" --no-slot prevent creation of temporary replication slot\n"));
@@ -676,6 +678,12 @@ verify_dir_is_empty_or_create(char *dirname, bool *created, bool *found)
/*
* Exists, empty
*/
+ if (chmod(dirname, pg_dir_create_mode) != 0)
+ {
+ fprintf(stderr, _("%s: could not change permissions of directory \"%s\": %s\n"),
+ progname, dirname, strerror(errno));
+ exit(1);
+ }
if (found)
*found = true;
return;
@@ -1445,7 +1453,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
if (file == NULL)
{
- int filemode;
+ mode_t oumask;
/*
* No current file, so this must be the header for a new file
@@ -1459,9 +1467,6 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
current_len_left = read_tar_number(&copybuf[124], 12);
- /* Set permissions on the file */
- filemode = read_tar_number(&copybuf[100], 8);
-
/*
* All files are padded up to 512 bytes
*/
@@ -1505,11 +1510,6 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
exit(1);
}
}
-#ifndef WIN32
- if (chmod(filename, (mode_t) filemode))
- pg_log_error("could not set permissions on directory \"%s\": %m",
- filename);
-#endif
}
else if (copybuf[156] == '2')
{
@@ -1547,19 +1547,15 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
/*
* regular file
*/
+ oumask = umask(pg_mode_mask);
file = fopen(filename, "wb");
+ umask(oumask);
if (!file)
{
pg_log_error("could not create file \"%s\": %m", filename);
exit(1);
}
-#ifndef WIN32
- if (chmod(filename, (mode_t) filemode))
- pg_log_error("could not set permissions on file \"%s\": %m",
- filename);
-#endif
-
if (current_len_left == 0)
{
/*
@@ -1795,6 +1791,7 @@ BaseBackup(void)
char *basebkp;
char escaped_label[MAXPGPATH];
char *maxrate_clause = NULL;
+ char *group_access_mode_clause = NULL;
int i;
char xlogstart[64];
char xlogend[64];
@@ -1868,8 +1865,14 @@ BaseBackup(void)
fprintf(stderr, "\n");
}
+ /* Request server to send the file permissions according to the request */
+ if (group_access_mode == GROUP_ACCESS_NONE)
+ group_access_mode_clause = psprintf("GROUP_MODE '%s'", "none");
+ else if (group_access_mode == GROUP_ACCESS_PROVIDE)
+ group_access_mode_clause = psprintf("GROUP_MODE '%s'", "group");
+
basebkp =
- psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s %s",
+ psprintf("BASE_BACKUP LABEL '%s' %s %s %s %s %s %s %s %s",
escaped_label,
showprogress ? "PROGRESS" : "",
includewal == FETCH_WAL ? "WAL" : "",
@@ -1877,7 +1880,8 @@ BaseBackup(void)
includewal == NO_WAL ? "" : "NOWAIT",
maxrate_clause ? maxrate_clause : "",
format == 't' ? "TABLESPACE_MAP" : "",
- verify_checksums ? "" : "NOVERIFY_CHECKSUMS");
+ verify_checksums ? "" : "NOVERIFY_CHECKSUMS",
+ group_access_mode_clause ? group_access_mode_clause : "");
if (PQsendQuery(conn, basebkp) == 0)
{
@@ -2193,6 +2197,7 @@ main(int argc, char **argv)
{"status-interval", required_argument, NULL, 's'},
{"verbose", no_argument, NULL, 'v'},
{"progress", no_argument, NULL, 'P'},
+ {"group-mode", required_argument, NULL, 'g'},
{"waldir", required_argument, NULL, 1},
{"no-slot", no_argument, NULL, 2},
{"no-verify-checksums", no_argument, NULL, 3},
@@ -2223,7 +2228,7 @@ main(int argc, char **argv)
atexit(cleanup_directories_atexit);
- while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:d:c:h:p:U:s:wWkvP",
+ while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:d:c:g:h:p:U:s:wWkvP",
long_options, &option_index)) != -1)
{
switch (c)
@@ -2361,6 +2366,30 @@ main(int argc, char **argv)
case 'P':
showprogress = true;
break;
+ case 'g':
+ if (strcmp(optarg, "i") == 0 ||
+ strcmp(optarg, "inherit") == 0)
+ {
+ group_access_mode = GROUP_ACCESS_INHERIT;
+ }
+ else if (strcmp(optarg, "g") == 0 ||
+ strcmp(optarg, "group") == 0)
+ {
+ group_access_mode = GROUP_ACCESS_PROVIDE;
+ }
+ else if (strcmp(optarg, "n") == 0 ||
+ strcmp(optarg, "none") == 0)
+ {
+ group_access_mode = GROUP_ACCESS_NONE;
+ }
+ else
+ {
+ fprintf(stderr,
+ _("%s: invalid group-mode option \"%s\", must be \"inherit\", \"group\", or \"none\"\n"),
+ progname, optarg);
+ exit(1);
+ }
+ break;
case 3:
verify_checksums = false;
break;
diff --git a/src/bin/pg_basebackup/streamutil.c b/src/bin/pg_basebackup/streamutil.c
index 79f17e4089..66bd8cd877 100644
--- a/src/bin/pg_basebackup/streamutil.c
+++ b/src/bin/pg_basebackup/streamutil.c
@@ -33,6 +33,7 @@
#define ERRCODE_DUPLICATE_OBJECT "42710"
uint32 WalSegSz;
+GroupAccessMode group_access_mode = GROUP_ACCESS_INHERIT;
static bool RetrieveDataDirCreatePerm(PGconn *conn);
@@ -364,36 +365,44 @@ RetrieveDataDirCreatePerm(PGconn *conn)
if (PQserverVersion(conn) < MINIMUM_VERSION_FOR_GROUP_ACCESS)
return true;
- res = PQexec(conn, "SHOW data_directory_mode");
- if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ if (group_access_mode == GROUP_ACCESS_INHERIT)
{
- pg_log_error("could not send replication command \"%s\": %s",
- "SHOW data_directory_mode", PQerrorMessage(conn));
+ res = PQexec(conn, "SHOW data_directory_mode");
+ if (PQresultStatus(res) != PGRES_TUPLES_OK)
+ {
+ pg_log_error("could not send replication command \"%s\": %s",
+ "SHOW data_directory_mode", PQerrorMessage(conn));
- PQclear(res);
- return false;
- }
- if (PQntuples(res) != 1 || PQnfields(res) < 1)
- {
- pg_log_error("could not fetch group access flag: got %d rows and %d fields, expected %d rows and %d or more fields",
- PQntuples(res), PQnfields(res), 1, 1);
+ PQclear(res);
+ return false;
+ }
+ if (PQntuples(res) != 1 || PQnfields(res) < 1)
+ {
+ pg_log_error("could not fetch group access flag: got %d rows and %d fields, expected %d rows and %d or more fields",
+ PQntuples(res), PQnfields(res), 1, 1);
- PQclear(res);
- return false;
- }
+ PQclear(res);
+ return false;
+ }
- if (sscanf(PQgetvalue(res, 0, 0), "%o", &data_directory_mode) != 1)
- {
- pg_log_error("group access flag could not be parsed: %s",
- PQgetvalue(res, 0, 0));
+ if (sscanf(PQgetvalue(res, 0, 0), "%o", &data_directory_mode) != 1)
+ {
+ pg_log_error("group access flag could not be parsed: %s",
+ PQgetvalue(res, 0, 0));
+
+ PQclear(res);
+ return false;
+ }
+
+ SetDataDirectoryCreatePerm(data_directory_mode);
PQclear(res);
- return false;
}
+ else if (group_access_mode == GROUP_ACCESS_NONE)
+ SetDataDirectoryCreatePerm(PG_DIR_MODE_OWNER);
+ else
+ SetDataDirectoryCreatePerm(PG_DIR_MODE_GROUP);
- SetDataDirectoryCreatePerm(data_directory_mode);
-
- PQclear(res);
return true;
}
diff --git a/src/bin/pg_basebackup/streamutil.h b/src/bin/pg_basebackup/streamutil.h
index a756eee262..7438cc5062 100644
--- a/src/bin/pg_basebackup/streamutil.h
+++ b/src/bin/pg_basebackup/streamutil.h
@@ -17,6 +17,16 @@
#include "access/xlogdefs.h"
#include "datatype/timestamp.h"
+/*
+ * Different ways to specify group access mode
+ */
+typedef enum
+{
+ GROUP_ACCESS_INHERIT = 0,
+ GROUP_ACCESS_PROVIDE,
+ GROUP_ACCESS_NONE
+} GroupAccessMode;
+
extern const char *progname;
extern char *connection_string;
extern char *dbhost;
@@ -25,6 +35,7 @@ extern char *dbport;
extern char *dbname;
extern int dbgetpassword;
extern uint32 WalSegSz;
+extern GroupAccessMode group_access_mode;
/* Connection kept global so we can disconnect easily */
extern PGconn *conn;
diff --git a/src/bin/pg_basebackup/t/010_pg_basebackup.pl b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
index b7d36b65dd..f5dc451f86 100644
--- a/src/bin/pg_basebackup/t/010_pg_basebackup.pl
+++ b/src/bin/pg_basebackup/t/010_pg_basebackup.pl
@@ -6,7 +6,7 @@ use File::Basename qw(basename dirname);
use File::Path qw(rmtree);
use PostgresNode;
use TestLib;
-use Test::More tests => 106;
+use Test::More tests => 118;
program_help_ok('pg_basebackup');
program_version_ok('pg_basebackup');
@@ -402,6 +402,86 @@ $node->command_ok(
'pg_basebackup -X stream runs with --no-slot');
rmtree("$tempdir/backupnoslot");
+# The following tests test backup unix file permissions. Windows doesn't support
+# unix file permissions, so skip on Windows.
+SKIP:
+{
+ skip "unix style file permissions are not supported on Windows", 12 if ($windows_os);
+
+ $node->stop;
+
+ # Set umask so test directories and files are created with default permissions
+ umask(0077);
+
+ # Disable group permissions on PGDATA
+ chmod_recursive("$pgdata", 0700, 0600);
+
+ $node->start;
+
+ $node->command_ok([ 'pg_basebackup', '-D', "$tempdir/backupP", '-g', 'group' ],
+ 'pg_basebackup -g group runs');
+
+ # Group access should be enabled on all backup files
+ ok(check_mode_recursive("$tempdir/backupP", 0750, 0640),
+ "check backup dir permissions");
+ rmtree("$tempdir/backupP");
+
+ $node->command_ok([ 'pg_basebackup', '-D', "$tempdir/backupP", '-g', 'inherit' ],
+ 'pg_basebackup -g inherit runs');
+
+ # Group access should not be enabled on any backup files
+ ok(check_mode_recursive("$tempdir/backupP", 0700, 0600),
+ "check backup dir permissions");
+ rmtree("$tempdir/backupP");
+
+ $node->command_ok([ 'pg_basebackup', '-D', "$tempdir/backupP", '-g', 'group', '-Ft' ],
+ 'pg_basebackup -g group in tar mode runs');
+
+ system_or_bail 'tar', '-xvf', "$tempdir/backupP/base.tar";
+
+ # Group access should be enabled on all backup files
+ ok(check_mode_recursive("$tempdir/backupP", 0750, 0640),
+ "check backup dir permissions");
+ rmtree("$tempdir/backupP");
+
+
+ $node->stop;
+
+ # Set umask so test directories and files are created with group permissions
+ umask(0027);
+
+ # Enable group permissions on PGDATA
+ chmod_recursive("$pgdata", 0750, 0640);
+
+ $node->start;
+
+ $node->command_ok([ 'pg_basebackup', '-D', "$tempdir/backupP", ],
+ 'pg_basebackup default runs');
+
+ # Group access should be enabled on all backup files
+ ok(check_mode_recursive("$tempdir/backupP", 0750, 0640),
+ "check backup dir permissions");
+ rmtree("$tempdir/backupP");
+
+ $node->command_ok([ 'pg_basebackup', '-D', "$tempdir/backupP", '-g', 'none'],
+ 'pg_basebackup -g none runs');
+
+ # Group access should not be enabled on any backup files
+ ok(check_mode_recursive("$tempdir/backupP", 0700, 0600),
+ "check backup dir permissions");
+ rmtree("$tempdir/backupP");
+
+ $node->command_ok([ 'pg_basebackup', '-D', "$tempdir/backupP", '-g', 'none', '-Ft' ],
+ 'pg_basebackup -g none in tar mode runs');
+
+ system_or_bail 'tar', '-xvf', "$tempdir/backupP/base.tar";
+
+ # Group access should not be enabled on any backup files
+ ok(check_mode_recursive("$tempdir/backupP", 0700, 0600),
+ "check backup dir permissions");
+ rmtree("$tempdir/backupP");
+}
+
$node->command_fails(
[
'pg_basebackup', '-D',
--
2.20.1.windows.1
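For readers following the group-mode patch above, here is a minimal sketch of the permission rewrite that the "group" and "none" modes are meant to produce (0750/0640 versus 0700/0600, as the TAP tests check). It is a simplification, not the patch's code: it replaces all permission bits rather than masking with pg_dir_create_mode, and the SKETCH_* constants and sketch_backup_mode() are hypothetical names standing in for PostgreSQL's PG_DIR_MODE_* and PG_FILE_MODE_* values.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for PostgreSQL's PG_*_MODE_* constants. */
#define SKETCH_DIR_MODE_OWNER   0700u
#define SKETCH_FILE_MODE_OWNER  0600u
#define SKETCH_DIR_MODE_GROUP   0750u
#define SKETCH_FILE_MODE_GROUP  0640u

/*
 * Return the mode to record for one backup entry.
 * The low nine permission bits are replaced; any higher bits
 * (for example file-type bits) are passed through unchanged.
 */
unsigned int
sketch_backup_mode(unsigned int mode, bool is_dir, bool want_group)
{
    unsigned int perms;

    if (is_dir)
        perms = want_group ? SKETCH_DIR_MODE_GROUP : SKETCH_DIR_MODE_OWNER;
    else
        perms = want_group ? SKETCH_FILE_MODE_GROUP : SKETCH_FILE_MODE_OWNER;

    return (mode & ~0777u) | perms;
}
```

With this shape, the "inherit" mode would simply skip the call and keep whatever mode the server reports.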
Attachment: 0002-New-TargetSessionAttrsType-enum.patch (application/octet-stream)
From 22d8f6c851cd7e55838ad4bcf6937b55b7144b8a Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 11:50:33 +1100
Subject: [PATCH 2/8] New TargetSessionAttrsType enum
This new enum lets the code compare the requested session type
directly instead of doing a string comparison each time. It does
not improve the current code much, but it will be useful for the
later patches in this series.
---
src/interfaces/libpq/fe-connect.c | 12 ++++++++----
src/interfaces/libpq/libpq-fe.h | 6 ++++++
src/interfaces/libpq/libpq-int.h | 1 +
3 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 44055a5682..ce690b052c 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1294,8 +1294,11 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ else if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -3480,8 +3483,7 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type != SESSION_TYPE_ANY)
{
/*
* Save existing error messages across the PQsendQuery
@@ -3790,6 +3792,8 @@ makeEmptyPGconn(void)
conn->try_gss = true;
#endif
+ conn->requested_session_type = SESSION_TYPE_ANY;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index ef21864abf..b76022a96e 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -71,6 +71,12 @@ typedef enum
CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
+typedef enum
+{
+ SESSION_TYPE_ANY = 0, /* Any session (default) */
+ SESSION_TYPE_READ_WRITE /* Read-write session */
+} TargetSessionAttrsType;
+
typedef enum
{
PGRES_POLLING_FAILED = 0,
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index c0b8e3f8ce..4f0d20a696 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -366,6 +366,7 @@ struct pg_conn
/* Type of connection to make. Possible values: any, read-write. */
char *target_session_attrs;
+ TargetSessionAttrsType requested_session_type;
/* Optional file to write trace info to */
FILE *Pfdebug;
--
2.20.1.windows.1
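The idea of the 0002 patch can be sketched in isolation as follows: parse the target_session_attrs string once, then compare enum values afterwards. This is only an illustration under assumptions; SketchSessionType and sketch_parse_session_attrs() are hypothetical names, and the real patch stores the result in conn->requested_session_type and reports errors through conn->errorMessage instead of returning a sentinel.

```c
#include <string.h>

/* Hypothetical mirror of the TargetSessionAttrsType enum in the patch. */
typedef enum
{
    SKETCH_SESSION_ANY = 0,     /* any session (default) */
    SKETCH_SESSION_READ_WRITE,  /* read-write session */
    SKETCH_SESSION_INVALID      /* unrecognized value (sketch only) */
} SketchSessionType;

/*
 * Map the option string to an enum value once, so later checks can
 * compare integers instead of calling strcmp() repeatedly.
 * A NULL value is treated as the default, "any".
 */
SketchSessionType
sketch_parse_session_attrs(const char *value)
{
    if (value == NULL || strcmp(value, "any") == 0)
        return SKETCH_SESSION_ANY;
    if (strcmp(value, "read-write") == 0)
        return SKETCH_SESSION_READ_WRITE;
    return SKETCH_SESSION_INVALID;
}
```

Adding a further value such as prefer-read then needs one more enum member and one more strcmp() branch, which is what the later patch in the series does.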
Attachment: 0003-Make-transaction_read_only-as-GUC_REPORT-varaible.patch (application/octet-stream)
From b9b1f02096412c8d2c3c4c23d71f4d98b46e0737 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 17:05:24 +1100
Subject: [PATCH 3/8] Make transaction_read_only as GUC_REPORT variable
The value of the transaction_read_only GUC is used in multi-host
connections to identify a read-write host. Currently this is done
by executing a command on each host to find out whether it is
read-write. Reporting the GUC to the client upon connection instead
avoids that extra round trip and reduces the time to connect.
---
doc/src/sgml/libpq.sgml | 14 ++++---
doc/src/sgml/protocol.sgml | 8 ++--
src/backend/utils/misc/guc.c | 2 +-
src/interfaces/libpq/fe-connect.c | 70 +++++++++++++++++++++++--------
src/interfaces/libpq/fe-exec.c | 6 ++-
src/interfaces/libpq/libpq-int.h | 1 +
6 files changed, 74 insertions(+), 27 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 17310eb077..6107755416 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1665,8 +1665,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, it means server
- doesn't support read-write transactions.
+ successful connection if the server is older than version 12; if it returns
+ <literal>on</literal>, the server does not support read-write transactions.
+ For server version 12 or later, libpq instead uses the value of the <varname>transaction_read_only</varname>
+ configuration parameter that the server reports upon successful connection.
</para>
</listitem>
</varlistentry>
@@ -2032,14 +2034,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index b20f1690a7..05286c7365 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELECT 1/0;
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1208eb9a68..e86b4d3319 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1545,7 +1545,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index ce690b052c..8bdabdf3b5 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3485,26 +3485,61 @@ keep_going: /* We will come back to here until there is
if (conn->sversion >= 70400 &&
conn->requested_session_type != SESSION_TYPE_ANY)
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
+ }
+ else if (conn->transaction_read_only)
+ {
+ /* Not writable; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
}
/* We can release the address list now. */
@@ -3785,6 +3820,7 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 3a8cddf4ff..701c4a66fe 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1059,7 +1059,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1113,6 +1113,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4f0d20a696..caddf0ddc5 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -431,6 +431,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.20.1.windows.1
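The version-dependent check in the 0003 patch boils down to a small decision table, sketched below. The names SketchAction and sketch_check_target() are hypothetical; the version thresholds (70400 and 120000) and the three outcomes are taken from the CONNECTION_CHECK_TARGET hunk above, and the real code performs I/O and error reporting at each branch rather than returning a value.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the target check (sketch only). */
typedef enum
{
    SKETCH_ACCEPT,           /* keep this connection */
    SKETCH_SEND_SHOW_QUERY,  /* ask with SHOW transaction_read_only */
    SKETCH_TRY_NEXT_HOST     /* close politely and move on */
} SketchAction;

/*
 * sversion            server version number, e.g. 120000 for version 12
 * want_read_write     true when target_session_attrs=read-write
 * reported_read_only  transaction_read_only as reported at connection
 *                     start (only meaningful for version 12 or later)
 */
SketchAction
sketch_check_target(int sversion, bool want_read_write,
                    bool reported_read_only)
{
    /* Pre-7.4 servers have no read-only mode; "any" needs no check. */
    if (sversion < 70400 || !want_read_write)
        return SKETCH_ACCEPT;

    /* Older servers do not report the GUC, so a query is required. */
    if (sversion < 120000)
        return SKETCH_SEND_SHOW_QUERY;

    /* Newer servers: decide from the reported value, no round trip. */
    return reported_read_only ? SKETCH_TRY_NEXT_HOST : SKETCH_ACCEPT;
}
```

The saving claimed by the patch comes from the last branch, where no extra query is sent.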
Attachment: 0004-New-prefer-read-target_session_attrs-type.patch (application/octet-stream)
From afe1430861a0d7a6d76c4511c14aaa9e47a2f095 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 17:56:48 +1100
Subject: [PATCH 4/8] New prefer-read target_session_attrs type
With the new prefer-read option, an application can prefer
connecting to a read-only server if one is available in the list
of hosts, and otherwise fall back to a read-write server.
---
doc/src/sgml/libpq.sgml | 21 ++--
src/interfaces/libpq/fe-connect.c | 161 ++++++++++++++++++++++----
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 13 ++-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
5 files changed, 177 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 6107755416..67f2c6a5c1 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1648,12 +1648,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- The supported options for this parameter are, <literal>any</literal> and
- <literal>read-write</literal>. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- If multiple hosts were specified in the connection string, based on the
- specified value, any remaining servers will be tried before confirming
- succesful connection or failure.
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>read-write</literal> and <literal>prefer-read</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts were specified in the
+ connection string, based on the specified value, any remaining servers
+ will be tried before confirming successful connection or failure.
</para>
<para>
@@ -1662,6 +1662,13 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
is considered acceptable.
</para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, a
+ connection in which read-only transactions are accepted by default
+ is preferred. If no such connection can be found, then a connection
+ in which read-write transactions are accepted will be considered.
+ </para>
+
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
@@ -1671,7 +1678,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
configuration parameter that is reported by the server upon successful connection.
</para>
</listitem>
- </varlistentry>
+ </varlistentry>
</variablelist>
</para>
</sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 8bdabdf3b5..df2c190a3f 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -340,7 +340,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1298,6 +1298,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_ANY;
else if (strcmp(conn->target_session_attrs, "read-write") == 0)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else
{
conn->status = CONNECTION_BAD;
@@ -2233,13 +2235,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to any read-only
+ * server, so now try to connect to the read-write server again.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3476,7 +3496,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required, see
+ * if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3491,8 +3512,8 @@ keep_going: /* We will come back to here until there is
* Save existing error messages across the PQsendQuery
* attempt. This is necessary because PQsendQuery is
* going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
*/
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
@@ -3504,16 +3525,30 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
+
conn->status = CONNECTION_CHECK_WRITABLE;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
- else if (conn->transaction_read_only)
+ else if ((conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
const char *displayed_host;
const char *displayed_port;
+ /*
+ * This case arises only in prefer-read mode, on the second
+ * pass over the host list: no server that is read-only by
+ * default was found, so accept the recorded read-write
+ * server.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_target_connection;
+
/* Append error report to conn->errorMessage. */
if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
displayed_host = conn->connhost[conn->whichhost].hostaddr;
@@ -3523,16 +3558,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3540,8 +3587,36 @@ keep_going: /* We will come back to here until there is
conn->try_next_host = true;
goto keep_going;
}
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2)
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3607,11 +3682,33 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested mode is read-write,
+ * ignore it. Server is read-write and requested mode is
+ * prefer-read, record it for the first time and try to
+ * consume in the next scan (it means no read-only server
+ * is found in the first scan).
+ */
+ if ((readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_write_connection;
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3624,16 +3721,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3642,7 +3751,8 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3829,6 +3939,7 @@ makeEmptyPGconn(void)
#endif
conn->requested_session_type = SESSION_TYPE_ANY;
+ conn->read_write_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index b76022a96e..f3833678e2 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -74,7 +74,8 @@ typedef enum
typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
- SESSION_TYPE_READ_WRITE /* Read-write session */
+ SESSION_TYPE_READ_WRITE, /* Read-write session */
+ SESSION_TYPE_PREFER_READ /* Prefer read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index caddf0ddc5..4e8bab3822 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -364,7 +364,10 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: any, read-write,
+ * prefer-read.
+ */
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -401,6 +404,14 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * First read-write host index in the connection string.
+ *
+ * Initial value is -1, then the index of the first read-write host, -2
+ * during the second attempt of connection to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 3c743d7d7c..af465be505 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 32;
+use Test::More tests => 35;
# Initialize master node
my $node_master = get_new_node('master');
@@ -121,6 +121,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to master in "prefer-read" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
2.20.1.windows.1
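For readers following the prefer-read logic in the patch above, the two-pass scan driven by read_write_host_index (-1 initially, then the first read-write host, then -2 during the fallback attempt) can be modelled roughly as below. This is only a sketch: pick_prefer_read is a hypothetical helper, not a libpq function, and it assumes every host in the list is reachable and that its read-only status is already known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the prefer-read host scan (not libpq code).
 * host_is_read_only[i] stands for "SHOW transaction_read_only" returning
 * "on" for host i.  Returns the index of the chosen host, or -1 if the
 * list is empty.
 */
static int
pick_prefer_read(const bool *host_is_read_only, int nhosts)
{
    int read_write_host_index = -1; /* -1: no read-write host seen yet */
    int i;

    assert(host_is_read_only != NULL || nhosts == 0);

    /* First pass: take the first read-only host we meet. */
    for (i = 0; i < nhosts; i++)
    {
        if (host_is_read_only[i])
            return i;

        /* Remember only the first read-write host for a later fallback. */
        if (read_write_host_index == -1)
            read_write_host_index = i;
    }

    /*
     * No read-only host was found.  The patch now jumps back to the
     * remembered host and sets the index to -2 so that this second
     * attempt is accepted instead of being skipped again.
     */
    return read_write_host_index;
}
```

With a master,standby1 list the sketch picks index 1, and with a master-only list it falls back to index 0, which matches what the new regression tests expect.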
0005-New-read-only-target_session_attrs-type.patch (application/octet-stream)
From 0410c3a0740d5ed68ea9d107393546d3d67d9541 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 18:02:59 +1100
Subject: [PATCH 5/8] New read-only target_session_attrs type
With the read-only option, an application connects only to a
read-only server from the list of hosts. If no read-only server
is available, the connection attempt fails.
---
doc/src/sgml/libpq.sgml | 7 +++++-
src/interfaces/libpq/fe-connect.c | 34 ++++++++++++++++++++-------
src/interfaces/libpq/libpq-fe.h | 3 ++-
src/interfaces/libpq/libpq-int.h | 2 +-
src/test/recovery/t/001_stream_rep.pl | 10 +++++++-
5 files changed, 43 insertions(+), 13 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 67f2c6a5c1..2be51b2a49 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1649,7 +1649,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal> and <literal>prefer-read</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1677,6 +1677,11 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
But for server version 12 or greater uses the value of <varname>transaction_read_only</varname>
configuration parameter that is reported by the server upon successful connection.
</para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection
+ in which read-only transactions are accepted by default.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index df2c190a3f..38cbb07828 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1300,6 +1300,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
+ else if (strcmp(conn->target_session_attrs, "read-only") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_ONLY;
else
{
conn->status = CONNECTION_BAD;
@@ -3496,8 +3498,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write or prefer-read connection is required, see
- * if we have one.
+ * If a read-write, prefer-read or read-only connection is
+ * required, see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3534,7 +3536,8 @@ keep_going: /* We will come back to here until there is
else if ((conn->transaction_read_only &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!conn->transaction_read_only &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/* Not a requested type; fail this connection. */
const char *displayed_host;
@@ -3594,17 +3597,28 @@ keep_going: /* We will come back to here until there is
/*
* Requested type is prefer-read, then record this host index
- * and try the other before considering it later
+ * and try the other before considering it later. If requested
+ * type of connection is read-only, ignore this connection.
*/
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)
{
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that are
+ * default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto target_accept_connection;
+
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->read_write_host_index == -1)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
/*
@@ -3692,12 +3706,14 @@ keep_going: /* We will come back to here until there is
* ignore it. Server is read-write and requested mode is
* prefer-read, record it for the first time and try to
* consume in the next scan (it means no read-only server
- * is found in the first scan).
+ * is found in the first scan). Server is read-write and
+ * requested mode is read-only, ignore this connection.
*/
if ((readonly_server &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!readonly_server &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/*
* The following scenario is possible only for the
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index f3833678e2..a015579c43 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -75,7 +75,8 @@ typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
- SESSION_TYPE_PREFER_READ /* Prefer read only session */
+ SESSION_TYPE_PREFER_READ, /* Prefer read only session */
+ SESSION_TYPE_READ_ONLY /* Read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4e8bab3822..a00d8bddeb 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -366,7 +366,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read.
+ * prefer-read and read-only.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index af465be505..ac1e11e1ab 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 35;
+use Test::More tests => 37;
# Initialize master node
my $node_master = get_new_node('master');
@@ -133,6 +133,14 @@ test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
test_target_session_attrs($node_master, $node_master, $node_master,
"prefer-read", 0);
+# Connect to standby1 in "read-only" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "read-only", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
2.20.1.windows.1
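The difference between read-write, prefer-read and read-only in the patch above comes down to which hosts are skipped on the first pass and whether a skipped host may be revisited. A minimal sketch of that decision follows, using illustrative enum and function names (the patch itself uses TargetSessionAttrsType and inline conditions in PQconnectPoll):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative names only; these are not the identifiers used by libpq. */
typedef enum
{
    SKETCH_ANY,
    SKETCH_READ_WRITE,
    SKETCH_PREFER_READ,
    SKETCH_READ_ONLY
} SketchSessionType;

/*
 * Returns true when a host should be skipped on the first pass over the
 * host list, given whether the server reports transaction_read_only = on.
 */
static bool
skip_on_first_pass(SketchSessionType requested, bool server_read_only)
{
    assert(requested <= SKETCH_READ_ONLY);

    if (server_read_only)
        return requested == SKETCH_READ_WRITE;

    return requested == SKETCH_PREFER_READ ||
           requested == SKETCH_READ_ONLY;
}

/*
 * Only prefer-read records a skipped read-write host and may return to
 * it later; read-only simply fails when no read-only server exists.
 */
static bool
may_fall_back(SketchSessionType requested)
{
    return requested == SKETCH_PREFER_READ;
}
```

Under this reading, read-only behaves like prefer-read without the second attempt, which is why the patch records the host index only for SESSION_TYPE_PREFER_READ.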
0006-Primary-prefer-standby-and-standby-options.patch (application/octet-stream)
From b816e48049cb41c9390761e5cb2546447ada38a2 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 10 Apr 2019 23:19:09 +1000
Subject: [PATCH 6/8] Primary, prefer-standby and standby options
Add new options that check whether a server is in recovery mode
before accepting a connection to it. To determine this, libpq sends
the query 'SELECT pg_is_in_recovery()' to the server.
---
doc/src/sgml/libpq.sgml | 26 ++-
src/interfaces/libpq/fe-connect.c | 236 +++++++++++++++++++++++---
src/interfaces/libpq/libpq-fe.h | 8 +-
src/interfaces/libpq/libpq-int.h | 4 +-
src/test/recovery/t/001_stream_rep.pl | 18 +-
5 files changed, 262 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 2be51b2a49..6b84accd08 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1649,7 +1649,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>prefer-standby</literal> and <literal>standby</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1682,6 +1683,29 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
If this parameter is set to <literal>read-only</literal>, only a connection
in which read-only transactions are accepted by default.
</para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, only a connection to
+ a server that is not in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection to
+ a server that is in recovery mode is preferred. If no such connection can be found,
+ then a connection to a server that is not in recovery mode will be considered.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection to
+ a server that is in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server is in recovery mode, the query <literal>SELECT pg_is_in_recovery()</literal>
+ will be sent upon any successful connection; if it returns <literal>t</literal>, the server
+ is in recovery mode.
+ </para>
+
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 38cbb07828..a6e0959895 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -124,6 +124,7 @@ static int ldapServiceLookup(const char *purl, PQconninfoOption *options,
#define DefaultOption ""
#define DefaultAuthtype ""
#define DefaultTargetSessionAttrs "any"
+#define DefaultTargetServerType "any"
#ifdef USE_SSL
#define DefaultSSLMode "prefer"
#else
@@ -340,7 +341,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1302,6 +1303,12 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else if (strcmp(conn->target_session_attrs, "read-only") == 0)
conn->requested_session_type = SESSION_TYPE_READ_ONLY;
+ else if (strcmp(conn->target_session_attrs, "primary") == 0)
+ conn->requested_session_type = SESSION_TYPE_PRIMARY;
+ else if (strcmp(conn->target_session_attrs, "prefer-standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_STANDBY;
+ else if (strcmp(conn->target_session_attrs, "standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_STANDBY;
else
{
conn->status = CONNECTION_BAD;
@@ -2198,6 +2205,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -2237,19 +2245,19 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- if (conn->read_write_host_index >= 0)
+ if (conn->read_write_or_primary_host_index >= 0)
{
/*
* Getting here means, failed to connect to read-only servers
* and now try connect to read-write server again.
*/
- conn->whichhost = conn->read_write_host_index;
+ conn->whichhost = conn->read_write_or_primary_host_index;
/*
* Reset the host index value to avoid recursion during the
* second connection attempt.
*/
- conn->read_write_host_index = -2;
+ conn->read_write_or_primary_host_index = -2;
}
else
{
@@ -3506,7 +3514,9 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->requested_session_type != SESSION_TYPE_ANY)
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY))
{
if (conn->sversion < 120000)
{
@@ -3549,7 +3559,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
/* Append error report to conn->errorMessage. */
@@ -3580,8 +3590,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3596,30 +3606,70 @@ keep_going: /* We will come back to here until there is
}
/*
- * Requested type is prefer-read, then record this host index
- * and try the other before considering it later. If requested
- * type of connection is read-only, ignore this connection.
+ * Servers before 9.0 don't support recovery, so skip the check
+ * when the requested type of connection is primary,
+ * prefer-standby or standby.
*/
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_session_type == SESSION_TYPE_PRIMARY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ /*
+ * If the requested type is prefer-read or prefer-standby,
+ * record this host index and try the other hosts before
+ * considering it later. If the requested type is read-only or
+ * standby, ignore this connection.
+ */
+
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
- conn->requested_session_type == SESSION_TYPE_READ_ONLY)
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)
{
/*
* The following scenario is possible only for the
- * prefer-read mode for the next pass of the list of
- * connections as it couldn't find any servers that are
- * default read-only.
+ * prefer-read or prefer-standby mode for the next pass of
+ * the list of connections as it couldn't find any servers
+ * that are default read-only or in recovery mode.
*/
- if (conn->read_write_host_index == -2)
- goto target_accept_connection;
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY)
+ {
+ if (conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+ }
/*
* Try next host if any, but we don't want to consider
@@ -3721,7 +3771,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_write_connection;
/* Not a requested type; fail this connection. */
@@ -3756,8 +3806,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3810,6 +3860,144 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in recovery mode and requested mode is
+ * primary, ignore it. Server is not in recovery mode and
+ * requested mode is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server is found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_recovery_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
+ consume_checked_recovery_connection:
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3955,7 +4143,7 @@ makeEmptyPGconn(void)
#endif
conn->requested_session_type = SESSION_TYPE_ANY;
- conn->read_write_host_index = -1;
+ conn->read_write_or_primary_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index a015579c43..6c5780c672 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
@@ -76,7 +77,10 @@ typedef enum
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
SESSION_TYPE_PREFER_READ, /* Prefer read only session */
- SESSION_TYPE_READ_ONLY /* Read only session */
+ SESSION_TYPE_READ_ONLY, /* Read only session */
+ SESSION_TYPE_PRIMARY, /* Primary server */
+ SESSION_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SESSION_TYPE_STANDBY /* Standby server */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index a00d8bddeb..740033e116 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -366,7 +366,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read and read-only.
+ * prefer-read, read-only, primary, prefer-standby and standby.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -410,7 +410,7 @@ struct pg_conn
* Initial value is -1, then the index of the first read-write host, -2
* during the second attempt of connection to avoid recursion.
*/
- int read_write_host_index;
+ int read_write_or_primary_host_index;
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index ac1e11e1ab..8fa28dab23 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 37;
+use Test::More tests => 41;
# Initialize master node
my $node_master = get_new_node('master');
@@ -141,6 +141,22 @@ test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"read-only", 0);
+# Connect to master in "primary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
2.20.1.windows.1
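For readers following the new test cases, the host selection they exercise can be modeled roughly as below. This is only a simplified sketch and not code from the patch: sketch_pick_host, SketchMode and the boolean array standing in for each host's recovery state are hypothetical names. The real libpq logic runs inside PQconnectPoll and walks the host list a second time instead of returning an index directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the new target_session_attrs values. */
typedef enum
{
	SKETCH_MODE_PRIMARY,
	SKETCH_MODE_PREFER_STANDBY,
	SKETCH_MODE_STANDBY
} SketchMode;

/*
 * Return the index of the host that would be chosen, or -1 if none fits.
 * in_recovery[i] is true when host i is a standby (assumed known up front,
 * which the real client only learns after connecting).
 */
int
sketch_pick_host(const bool *in_recovery, size_t nhosts, SketchMode mode)
{
	int			first_primary = -1;
	size_t		i;

	assert(in_recovery != NULL || nhosts == 0);

	for (i = 0; i < nhosts; i++)
	{
		if (mode == SKETCH_MODE_PRIMARY)
		{
			/* "primary": take the first host that is not in recovery */
			if (!in_recovery[i])
				return (int) i;
		}
		else
		{
			/* "standby" and "prefer-standby": take the first standby */
			if (in_recovery[i])
				return (int) i;
			/* remember the first primary for the prefer-standby fallback */
			if (first_primary == -1)
				first_primary = (int) i;
		}
	}

	/* only "prefer-standby" falls back to a primary */
	return (mode == SKETCH_MODE_PREFER_STANDBY) ? first_primary : -1;
}
```

With a host list of standby1,master the sketch picks index 1 for "primary", and with master,master it falls back to index 0 for "prefer-standby", which is the outcome the new tests expect.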
0007-New-function-to-rejecting-the-checked-write-connecti.patch (application/octet-stream)
From 65e9a74dfe5f8fb4bf84cd7df8da93d32662a282 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Mon, 25 Mar 2019 18:11:18 +1100
Subject: [PATCH 7/8] New function to rejecting the checked write connection
After a connection has been checked for writability, call the newly
added function if the result means the connection must be rejected.
This replaces the rejection code that was duplicated at both call sites.
---
src/interfaces/libpq/fe-connect.c | 123 ++++++++++++------------------
1 file changed, 47 insertions(+), 76 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index a6e0959895..94a537927a 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2123,6 +2123,51 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+static void
+reject_checked_write_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3549,10 +3594,6 @@ keep_going: /* We will come back to here until there is
(conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
- /* Not a requested type; fail this connection. */
- const char *displayed_host;
- const char *displayed_port;
-
/*
* The following scenario is possible only for the
* prefer-read mode for the next pass of the list of
@@ -3562,42 +3603,7 @@ keep_going: /* We will come back to here until there is
if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
@@ -3778,42 +3784,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
--
2.20.1.windows.1
0008-Server-recovery-mode-handling.patch (application/octet-stream)
From 05da6b0dcf5a9255ee62a0bac667f3a13366502d Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 28 Mar 2019 15:30:01 +1100
Subject: [PATCH 8/8] Server recovery mode handling
An in_recovery GUC marked GUC_REPORT is added so that clients are told
whether the server is in recovery mode. This lets a client connection
check for a standby server quickly instead of executing a command.
A new SIGUSR1 interrupt is added to report the exit from recovery mode
to all backends and their respective clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
---
doc/src/sgml/libpq.sgml | 14 ++-
doc/src/sgml/protocol.sgml | 8 +-
src/backend/access/transam/xlog.c | 3 +
src/backend/commands/async.c | 1 -
src/backend/storage/ipc/procarray.c | 28 ++++++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 ++
src/backend/tcop/postgres.c | 60 ++++++++++++
src/backend/utils/init/postinit.c | 6 +-
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 17 +++-
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/include/tcop/tcopprot.h | 2 +
src/interfaces/libpq/fe-connect.c | 133 +++++++++++++++++----------
src/interfaces/libpq/fe-exec.c | 4 +
src/interfaces/libpq/libpq-int.h | 1 +
18 files changed, 234 insertions(+), 61 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 6b84accd08..90e8c96820 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1702,8 +1702,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server is in recovery mode or not, query <literal>SELECT pg_is_in_recovery()</literal>
- will be sent upon any successful connection; if it returns <literal>t</literal>, means server
- is in recovery mode.
+ will be sent upon any successful connection if the server is prior to version 12; if it returns
+ <literal>t</literal>, the server is in recovery mode. For server version 12 or later, libpq instead
+ uses the value of the <varname>in_recovery</varname> configuration parameter that is reported by the
+ server upon successful connection.
</para>
</listitem>
@@ -2071,15 +2073,17 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by release before 12.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 05286c7365..9ed2a03b73 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1284,15 +1284,17 @@ SELECT 1/0;
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1c7dd51b9f..976e6f63ae 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7755,6 +7755,9 @@ StartupXLOG(void)
XLogCtl->SharedRecoveryInProgress = false;
SpinLockRelease(&XLogCtl->info_lck);
+ if (standbyState != STANDBY_DISABLED)
+ SendRecoveryExitSignal();
+
UpdateControlFile();
LWLockRelease(ControlFileLock);
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 738e6ec7e2..102094e765 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -1736,7 +1736,6 @@ ProcessNotifyInterrupt(void)
ProcessIncomingNotify();
}
-
/*
* Read all pending notifications from the queue, and deliver appropriate
* ones to my frontend. Stop when we reach queue head or an uncommitted
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 18a0f62ba6..ad72ab10c7 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2972,6 +2972,34 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
return true; /* timed out, still conflicts */
}
+/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
/*
* ProcArraySetReplicationSlotXmin
*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7605b2c367..e4548dc323 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -292,6 +292,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
SetLatch(MyLatch);
latch_sigusr1_handler();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 842fcabd97..831cd80357 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -138,6 +138,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 44a59e1d4f..33511a9433 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -167,6 +167,15 @@ static bool RecoveryConflictPending = false;
static bool RecoveryConflictRetryable = true;
static ProcSignalReason RecoveryConflictReason;
+/*
+ * Inbound recovery exit interrupts are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
+
/* reused buffer to pass to SendRowDescriptionMessage() */
static MemoryContext row_description_context = NULL;
static StringInfoData row_description_buf;
@@ -195,6 +204,7 @@ static void drop_unnamed_stmt(void);
static void log_disconnections(int code, Datum arg);
static void enable_statement_timeout(void);
static void disable_statement_timeout(void);
+static void ProcessRecoveryExitInterrupt(void);
/* ----------------------------------------------------------------
@@ -543,6 +553,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
@@ -2954,6 +2968,52 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
errno = save_errno;
}
+/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited the recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+*/
+static void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
+
+
/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index e9f72b5069..29a159dc11 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -651,7 +651,11 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
* This is handled by calling RecoveryInProgress and ignoring the
* result.
*/
- (void) RecoveryInProgress();
+ if (RecoveryInProgress())
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
}
else
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index d228bbed68..1c0f51ac8b 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_int \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index e86b4d3319..23b5b3e46c 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -582,7 +582,7 @@ static char *recovery_target_xid_string;
static char *recovery_target_time_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
-
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
char *role_string;
@@ -1771,6 +1771,21 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ /*
+ * Not for general use --- used to indicate whether the instance is
+ * in recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
{
{"allow_system_table_mods", PGC_POSTMASTER, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index da8b672096..86f0c13134 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -113,6 +113,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void XidCacheRemoveRunningXids(TransactionId xid,
int nxids, const TransactionId *xids,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 05b186a05c..9cf9560b06 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -42,6 +42,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index a3f8f82ff3..2c73f0c0a8 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 8dcfb40728..7b7bfe0209 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -71,6 +71,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
* handler */
+/* recovery exit interrupt handling function */
+extern void HandleRecoveryExitInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 94a537927a..6f13d099e2 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2168,6 +2168,49 @@ reject_checked_write_connection(PGconn *conn)
conn->try_next_host = true;
}
+static void
+reject_checked_recovery_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3621,27 +3664,52 @@ keep_going: /* We will come back to here until there is
conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
conn->requested_session_type == SESSION_TYPE_STANDBY)))
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
}
+ else if ((conn->in_recovery &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
- conn->status = CONNECTION_CHECK_RECOVERY;
+ reject_checked_recovery_connection(conn);
+ goto keep_going;
+ }
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
}
/*
@@ -3890,40 +3958,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is not in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record primary host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_recovery_connection(conn);
goto keep_going;
}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 701c4a66fe..7cffed57cd 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1117,6 +1117,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
{
conn->transaction_read_only = (strcmp(value, "on") == 0);
}
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 740033e116..7b36ddb804 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -443,6 +443,7 @@ struct pg_conn
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
bool transaction_read_only; /* transaction_read_only */
+ bool in_recovery; /* in_recovery */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.20.1.windows.1
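The decision this patch adds to the recovery check can be summarized with a small stand-alone model. It is a sketch under stated assumptions, not the patch's code: the Sketch* names and the second_pass flag (standing in for read_write_or_primary_host_index == -2) are hypothetical, and the real logic lives in PQconnectPoll and pqSaveParameterStatus.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the session types handled by this check. */
typedef enum
{
	SKETCH_PRIMARY,
	SKETCH_PREFER_STANDBY,
	SKETCH_STANDBY
} SketchSessionType;

typedef enum
{
	SKETCH_SEND_QUERY,			/* fall back to SELECT pg_is_in_recovery() */
	SKETCH_ACCEPT,				/* consume the checked connection */
	SKETCH_REJECT				/* close it and try the next host */
} SketchAction;

/* Interpret a reported in_recovery value the way the patch does: "on" is true. */
bool
sketch_parse_in_recovery(const char *value)
{
	assert(value != NULL);
	return strcmp(value, "on") == 0;
}

/*
 * Decide what to do with a freshly authenticated connection.
 * sversion uses the usual numeric form, e.g. 120000 for version 12.
 */
SketchAction
sketch_check_recovery(int sversion, bool in_recovery,
					  SketchSessionType wanted, bool second_pass)
{
	bool		mismatch;

	/* Older servers do not report in_recovery, so a query is needed. */
	if (sversion < 120000)
		return SKETCH_SEND_QUERY;

	mismatch = (in_recovery && wanted == SKETCH_PRIMARY) ||
		(!in_recovery && wanted != SKETCH_PRIMARY);

	if (!mismatch)
		return SKETCH_ACCEPT;

	/* On the second pass (prefer-standby fallback) the host is accepted. */
	return second_pass ? SKETCH_ACCEPT : SKETCH_REJECT;
}
```

The point of the version gate is that a version 12 or later server needs no extra round trip, because the value has already arrived as a ParameterStatus message by the time the check runs.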
On Mon, 3 Jun 2019 at 16:32, Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
> On Thu, Apr 11, 2019 at 9:13 AM Haribabu Kommi <kommi.haribabu@gmail.com>
> wrote:
>> I fixed all the comments that you have raised above and attached the
>> updated patches.
> Rebased patches are attached.

Rebased patches are attached.
Regards,
Haribabu Kommi
Attachments:
0001-Restructure-the-code-to-remove-duplicate-code.patch (application/octet-stream)
From 724e98a9633c949665867eef43e701d6e5bdadf8 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 21 Feb 2019 23:11:55 +1100
Subject: [PATCH 1/8] Restructure the code to remove duplicate code
The server version check that precedes issuing SHOW transaction_read_only
(to find out whether the server is read-write) was duplicated. Move it
under a new connection status, CONNECTION_CHECK_TARGET, so that the
duplicate code is removed. This is required for the next set of patches.
---
doc/src/sgml/libpq.sgml | 26 +++++---
src/interfaces/libpq/fe-connect.c | 99 ++++++++++++-------------------
src/interfaces/libpq/libpq-fe.h | 3 +-
3 files changed, 57 insertions(+), 71 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 7f01fcc148..17310eb077 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1647,17 +1647,27 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<varlistentry id="libpq-connect-target-session-attrs" xreflabel="target_session_attrs">
<term><literal>target_session_attrs</literal></term>
<listitem>
+ <para>
+ The supported options for this parameter are <literal>any</literal> and
+ <literal>read-write</literal>. The default value of this parameter,
+ <literal>any</literal>, regards all connections as acceptable.
+ If multiple hosts were specified in the connection string, based on the
+ specified value, any remaining servers will be tried before confirming
+ successful connection or failure.
+ </para>
+
<para>
If this parameter is set to <literal>read-write</literal>, only a
connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server supports read-write transactions or not,
+ the query <literal>SHOW transaction_read_only</literal> will be sent upon any
+ successful connection; if it returns <literal>on</literal>, the server
+ doesn't support read-write transactions.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index c800d7921e..6e70f4abd7 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3433,6 +3433,43 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_WRITING;
}
+ conn->status = CONNECTION_CHECK_TARGET;
+ goto keep_going;
+ }
+
+ case CONNECTION_SETENV:
+ {
+
+ /*
+ * Do post-connection housekeeping (only needed in protocol 2.0).
+ *
+ * We pretend that the connection is OK for the duration of these
+ * queries.
+ */
+ conn->status = CONNECTION_OK;
+
+ switch (pqSetenvPoll(conn))
+ {
+ case PGRES_POLLING_OK: /* Success */
+ break;
+
+ case PGRES_POLLING_READING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_READING;
+
+ case PGRES_POLLING_WRITING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_WRITING;
+
+ default:
+ goto error_return;
+ }
+ }
+
+ /* Intentional fall through */
+
+ case CONNECTION_CHECK_TARGET:
+ {
/*
* If a read-write connection is required, see if we have one.
*
@@ -3474,68 +3511,6 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_OK;
}
- case CONNECTION_SETENV:
-
- /*
- * Do post-connection housekeeping (only needed in protocol 2.0).
- *
- * We pretend that the connection is OK for the duration of these
- * queries.
- */
- conn->status = CONNECTION_OK;
-
- switch (pqSetenvPoll(conn))
- {
- case PGRES_POLLING_OK: /* Success */
- break;
-
- case PGRES_POLLING_READING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_READING;
-
- case PGRES_POLLING_WRITING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_WRITING;
-
- default:
- goto error_return;
- }
-
- /*
- * If a read-write connection is required, see if we have one.
- * (This should match the stanza in the CONNECTION_AUTH_OK case
- * above.)
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but by
- * the same token they don't have any read-only mode, so we may
- * just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
- {
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
- {
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
- }
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
- }
-
- /* We can release the address list now. */
- release_conn_addrinfo(conn);
-
- /* We are open for business! */
- conn->status = CONNECTION_OK;
- return PGRES_POLLING_OK;
-
case CONNECTION_CONSUME:
{
conn->status = CONNECTION_OK;
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 26198fc1de..ef21864abf 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -67,7 +67,8 @@ typedef enum
* connection. */
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
- CONNECTION_GSS_STARTUP /* Negotiating GSSAPI. */
+ CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
+ CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
typedef enum
--
2.21.0
Attachment: 0002-New-TargetSessionAttrsType-enum.patch (application/octet-stream)
From 39bc86ec043aa043d070c0d4a8be86176fd50584 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 11:50:33 +1100
Subject: [PATCH 2/8] New TargetSessionAttrsType enum
This new enum lets the code compare the requested session type
directly instead of repeatedly comparing strings. It brings little
benefit to the current code, but later patches in this series
rely on it.
---
src/interfaces/libpq/fe-connect.c | 12 ++++++++----
src/interfaces/libpq/libpq-fe.h | 6 ++++++
src/interfaces/libpq/libpq-int.h | 1 +
3 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 6e70f4abd7..8b2d46d31c 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1294,8 +1294,11 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ else if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -3478,8 +3481,7 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type != SESSION_TYPE_ANY)
{
/*
* Save existing error messages across the PQsendQuery
@@ -3788,6 +3790,8 @@ makeEmptyPGconn(void)
conn->try_gss = true;
#endif
+ conn->requested_session_type = SESSION_TYPE_ANY;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index ef21864abf..b76022a96e 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -71,6 +71,12 @@ typedef enum
CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
+typedef enum
+{
+ SESSION_TYPE_ANY = 0, /* Any session (default) */
+ SESSION_TYPE_READ_WRITE /* Read-write session */
+} TargetSessionAttrsType;
+
typedef enum
{
PGRES_POLLING_FAILED = 0,
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index c0b8e3f8ce..4f0d20a696 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -366,6 +366,7 @@ struct pg_conn
/* Type of connection to make. Possible values: any, read-write. */
char *target_session_attrs;
+ TargetSessionAttrsType requested_session_type;
/* Optional file to write trace info to */
FILE *Pfdebug;
--
2.21.0
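The string-to-enum mapping that the 0002 patch adds to connectOptions2() can be sketched in isolation as below. This is only an illustration: parse_target_session_attrs() is a hypothetical helper that does not exist in libpq, and the enum is copied from the patch rather than taken from a released header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy of the enum that the 0002 patch adds to libpq-fe.h. */
typedef enum
{
	SESSION_TYPE_ANY = 0,		/* Any session (default) */
	SESSION_TYPE_READ_WRITE		/* Read-write session */
} TargetSessionAttrsType;

/*
 * Hypothetical helper, not part of the patch: translate the option string
 * into the enum.  Returns 1 on success and 0 for an unrecognized value.
 * A NULL string means the option was not given, so the default applies.
 */
static int
parse_target_session_attrs(const char *value, TargetSessionAttrsType *out)
{
	assert(out != NULL);

	if (value == NULL || strcmp(value, "any") == 0)
		*out = SESSION_TYPE_ANY;
	else if (strcmp(value, "read-write") == 0)
		*out = SESSION_TYPE_READ_WRITE;
	else
		return 0;				/* caller reports the invalid value */

	return 1;
}
```

Parsing once at option-processing time means the connection state machine can later test the enum instead of calling strcmp() on every pass.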
Attachment: 0003-Make-transaction_read_only-as-GUC_REPORT-variable.patch (application/octet-stream)
From 1eaf6bd08575935e223569e1c4fb8dcf19439246 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 17:05:24 +1100
Subject: [PATCH 3/8] Make transaction_read_only as GUC_REPORT variable
The value of the transaction_read_only GUC is used in multi-host
connections to identify a read-write host. Currently libpq finds
this out by executing a SHOW command on each host. Marking the GUC
as GUC_REPORT makes the server report it to the client at connection
start instead, which reduces the time needed to make the connection.
---
doc/src/sgml/libpq.sgml | 14 ++++---
doc/src/sgml/protocol.sgml | 8 ++--
src/backend/utils/misc/guc.c | 2 +-
src/interfaces/libpq/fe-connect.c | 70 +++++++++++++++++++++++--------
src/interfaces/libpq/fe-exec.c | 6 ++-
src/interfaces/libpq/libpq-int.h | 1 +
6 files changed, 74 insertions(+), 27 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 17310eb077..6107755416 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1665,8 +1665,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, it means server
- doesn't support read-write transactions.
+ successful connection if the server is prior to version 12; if it returns
+ <literal>on</literal>, it means server doesn't support read-write transactions.
+ But for server version 12 or greater uses the value of <varname>transaction_read_only</varname>
+ configuration parameter that is reported by the server upon successful connection.
</para>
</listitem>
</varlistentry>
@@ -2032,14 +2034,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index b20f1690a7..05286c7365 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELECT 1/0;
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 92c4fee8f8..5357e51d3f 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1544,7 +1544,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 8b2d46d31c..eceb586fb2 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3483,26 +3483,61 @@ keep_going: /* We will come back to here until there is
if (conn->sversion >= 70400 &&
conn->requested_session_type != SESSION_TYPE_ANY)
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
+ }
+ else if (conn->transaction_read_only)
+ {
+ /* Not writable; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
}
/* We can release the address list now. */
@@ -3783,6 +3818,7 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 3a8cddf4ff..701c4a66fe 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1059,7 +1059,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1113,6 +1113,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4f0d20a696..caddf0ddc5 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -431,6 +431,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.21.0
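The version-dependent choice that the 0003 patch introduces in CONNECTION_CHECK_TARGET can be summarized with the small sketch below. The names ReadOnlyCheckMethod, choose_check_method() and reported_value_is_read_only() are made up for illustration; only the version thresholds (70400 and 120000) and the comparison against "on" come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of the three paths through the patched check. */
typedef enum
{
	CHECK_NOT_NEEDED,			/* any session is fine, or pre-7.4 server */
	CHECK_BY_SHOW_QUERY,		/* must send SHOW transaction_read_only */
	CHECK_BY_REPORTED_GUC		/* value already arrived as ParameterStatus */
} ReadOnlyCheckMethod;

/*
 * sversion uses libpq's numeric format (for example 120000 for version 12).
 * any_session_ok is true when target_session_attrs is "any".
 */
static ReadOnlyCheckMethod
choose_check_method(int sversion, bool any_session_ok)
{
	if (sversion < 70400 || any_session_ok)
		return CHECK_NOT_NEEDED;
	if (sversion < 120000)
		return CHECK_BY_SHOW_QUERY;
	return CHECK_BY_REPORTED_GUC;
}

/* Same test the patch applies in pqSaveParameterStatus(). */
static bool
reported_value_is_read_only(const char *value)
{
	assert(value != NULL);
	return strcmp(value, "on") == 0;
}
```

The saving comes from the last branch: for servers that report the GUC, no extra query round trip is needed before accepting or rejecting the host.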
Attachment: 0004-New-prefer-read-target_session_attrs-type.patch (application/octet-stream)
From 0379a556ac79f76cbe690b5fad5bbae293d4b53c Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 17:56:48 +1100
Subject: [PATCH 4/8] New prefer-read target_session_attrs type
With the prefer-read option, an application can prefer connecting
to a read-only server if one is available in the list of hosts;
otherwise it connects to a read-write server.
---
doc/src/sgml/libpq.sgml | 21 ++--
src/interfaces/libpq/fe-connect.c | 161 ++++++++++++++++++++++----
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 13 ++-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
5 files changed, 177 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 6107755416..67f2c6a5c1 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1648,12 +1648,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- The supported options for this parameter are, <literal>any</literal> and
- <literal>read-write</literal>. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- If multiple hosts were specified in the connection string, based on the
- specified value, any remaining servers will be tried before confirming
- succesful connection or failure.
+ The supported options for this parameter are, <literal>any</literal>,
+ <literal>read-write</literal> and <literal>prefer-read</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts were specified in the
+ connection string, based on the specified value, any remaining servers
+ will be tried before confirming successful connection or failure.
</para>
<para>
@@ -1662,6 +1662,13 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
is considered acceptable.
</para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, a
+ connection in which read-only transactions are accepted by default
+ is preferred. If no such connection can be found, a connection
+ in which read-write transactions are accepted will be considered.
+ </para>
+
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
@@ -1671,7 +1678,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
configuration parameter that is reported by the server upon successful connection.
</para>
</listitem>
- </varlistentry>
+ </varlistentry>
</variablelist>
</para>
</sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index eceb586fb2..59ee803340 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -340,7 +340,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1298,6 +1298,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_ANY;
else if (strcmp(conn->target_session_attrs, "read-write") == 0)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else
{
conn->status = CONNECTION_BAD;
@@ -2231,13 +2233,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to any read-only
+ * server, so now retry the read-write server recorded earlier.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3474,7 +3494,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required, see
+ * if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3489,8 +3510,8 @@ keep_going: /* We will come back to here until there is
* Save existing error messages across the PQsendQuery
* attempt. This is necessary because PQsendQuery is
* going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
*/
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
@@ -3502,16 +3523,30 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
+
conn->status = CONNECTION_CHECK_WRITABLE;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
- else if (conn->transaction_read_only)
+ else if ((conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
const char *displayed_host;
const char *displayed_port;
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_target_connection;
+
/* Append error report to conn->errorMessage. */
if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
displayed_host = conn->connhost[conn->whichhost].hostaddr;
@@ -3521,16 +3556,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3538,8 +3585,36 @@ keep_going: /* We will come back to here until there is
conn->try_next_host = true;
goto keep_going;
}
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2)
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3605,11 +3680,33 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested mode is read-write,
+ * ignore it. Server is read-write and requested mode is
+ * prefer-read, record it for the first time and try to
+ * consume in the next scan (it means no read-only server
+ * is found in the first scan).
+ */
+ if ((readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_write_connection;
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3622,16 +3719,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3640,7 +3749,8 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3827,6 +3937,7 @@ makeEmptyPGconn(void)
#endif
conn->requested_session_type = SESSION_TYPE_ANY;
+ conn->read_write_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index b76022a96e..f3833678e2 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -74,7 +74,8 @@ typedef enum
typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
- SESSION_TYPE_READ_WRITE /* Read-write session */
+ SESSION_TYPE_READ_WRITE, /* Read-write session */
+ SESSION_TYPE_PREFER_READ /* Prefer read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index caddf0ddc5..4e8bab3822 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -364,7 +364,10 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: any, read-write,
+ * prefer-read.
+ */
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -401,6 +404,14 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * First read-write host index in the connection string.
+ *
+ * Initial value is -1, then the index of the first read-write host, -2
+ * during the second attempt of connection to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 3c743d7d7c..af465be505 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 32;
+use Test::More tests => 35;
# Initialize master node
my $node_master = get_new_node('master');
@@ -121,6 +121,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
2.21.0
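The two-pass behaviour that the 0004 patch implements with read_write_host_index can be modelled without any networking, as in the sketch below. It is a simplification under stated assumptions: pick_prefer_read_host() is a hypothetical function, every host is assumed reachable, and the read-only state of each host is supplied as an array instead of being learned from the server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of prefer-read host selection.
 *
 * read_only[i] is true when host i reports transaction_read_only = on.
 * Returns the index of the host the connection would settle on, or -1
 * when the host list is empty.
 */
static int
pick_prefer_read_host(const bool *read_only, int nhosts)
{
	int			first_read_write = -1;	/* plays the role of read_write_host_index */
	int			i;

	assert(read_only != NULL || nhosts == 0);

	/* First pass: accept the first read-only host we meet. */
	for (i = 0; i < nhosts; i++)
	{
		if (read_only[i])
			return i;
		if (first_read_write == -1)
			first_read_write = i;	/* remember it for the fallback */
	}

	/* Second pass: no read-only host found, fall back to read-write. */
	return first_read_write;
}
```

The three cases added to 001_stream_rep.pl correspond to the inputs {master, standby}, {standby, master} and {master, master}: the first two settle on the standby and the last one on the master. With the patch applied, an application would request this behaviour with a connection string along the lines of host=primary,standby1 target_session_attrs=prefer-read, where the host names are placeholders.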
Attachment: 0005-New-read-only-target_session_attrs-type.patch (application/octet-stream)
From 938a50703283eaf8983ca33739eea308248d8587 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 18:02:59 +1100
Subject: [PATCH 5/8] New read-only target_session_attrs type
With the read-only option, an application connects only to a
read-only server from the list of hosts. If no read-only server
is available, the connection attempt fails.
---
doc/src/sgml/libpq.sgml | 7 +++++-
src/interfaces/libpq/fe-connect.c | 34 ++++++++++++++++++++-------
src/interfaces/libpq/libpq-fe.h | 3 ++-
src/interfaces/libpq/libpq-int.h | 2 +-
src/test/recovery/t/001_stream_rep.pl | 10 +++++++-
5 files changed, 43 insertions(+), 13 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 67f2c6a5c1..2be51b2a49 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1649,7 +1649,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal> and <literal>prefer-read</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1677,6 +1677,11 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
But for server version 12 or greater uses the value of <varname>transaction_read_only</varname>
configuration parameter that is reported by the server upon successful connection.
</para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection
+ in which read-only transactions are accepted by default is considered acceptable.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 59ee803340..3217ae0303 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1300,6 +1300,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
+ else if (strcmp(conn->target_session_attrs, "read-only") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_ONLY;
else
{
conn->status = CONNECTION_BAD;
@@ -3494,8 +3496,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write or prefer-read connection is required, see
- * if we have one.
+ * If a read-write, prefer-read or read-only connection is
+ * required, see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3532,7 +3534,8 @@ keep_going: /* We will come back to here until there is
else if ((conn->transaction_read_only &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!conn->transaction_read_only &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/* Not a requested type; fail this connection. */
const char *displayed_host;
@@ -3592,17 +3595,28 @@ keep_going: /* We will come back to here until there is
/*
* Requested type is prefer-read, then record this host index
- * and try the other before considering it later
+ * and try the other before considering it later. If requested
+ * type of connection is read-only, ignore this connection.
*/
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)
{
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that are
+ * default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto target_accept_connection;
+
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->read_write_host_index == -1)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
/*
@@ -3690,12 +3704,14 @@ keep_going: /* We will come back to here until there is
* ignore it. Server is read-write and requested mode is
* prefer-read, record it for the first time and try to
* consume in the next scan (it means no read-only server
- * is found in the first scan).
+ * is found in the first scan). Server is read-write and
+ * requested mode is read-only, ignore this connection.
*/
if ((readonly_server &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!readonly_server &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/*
* The following scenario is possible only for the
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index f3833678e2..a015579c43 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -75,7 +75,8 @@ typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
- SESSION_TYPE_PREFER_READ /* Prefer read only session */
+ SESSION_TYPE_PREFER_READ, /* Prefer read only session */
+ SESSION_TYPE_READ_ONLY /* Read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4e8bab3822..a00d8bddeb 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -366,7 +366,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read.
+ * prefer-read and read-only.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index af465be505..ac1e11e1ab 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 35;
+use Test::More tests => 37;
# Initialize master node
my $node_master = get_new_node('master');
@@ -133,6 +133,14 @@ test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
test_target_session_attrs($node_master, $node_master, $node_master,
"prefer-read", 0);
+# Connect to standby1 in "read-only" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "read-only", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
2.21.0
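The acceptance rule that the read-only patch above adds can be summarised as a small decision function. The following is only a sketch of that rule as it reads from the diff, not code from the patch: the enum is a standalone copy of the values shown in libpq-fe.h, and session_type_accepts and its second_pass argument are hypothetical names, where second_pass stands for the state in which read_write_host_index has been set to -2.

```c
#include <assert.h>
#include <stdbool.h>

/* Standalone copy of the enum values from the patch, for illustration only. */
typedef enum
{
	SESSION_TYPE_ANY = 0,		/* Any session (default) */
	SESSION_TYPE_READ_WRITE,	/* Read-write session */
	SESSION_TYPE_PREFER_READ,	/* Prefer read only session */
	SESSION_TYPE_READ_ONLY		/* Read only session */
} TargetSessionAttrsType;

/*
 * Hypothetical helper: would a server whose transaction_read_only setting is
 * `read_only` be accepted for `requested`?  `second_pass` is true only when
 * the host list is being rescanned after no read-only server was found.
 */
static bool
session_type_accepts(TargetSessionAttrsType requested,
					 bool read_only, bool second_pass)
{
	switch (requested)
	{
		case SESSION_TYPE_ANY:
			return true;
		case SESSION_TYPE_READ_WRITE:
			return !read_only;
		case SESSION_TYPE_PREFER_READ:
			/* Falls back to a read-write server on the second pass. */
			return read_only || second_pass;
		case SESSION_TYPE_READ_ONLY:
			/* Never falls back: no host index is recorded for this mode. */
			return read_only;
	}
	assert(false && "unhandled session type");
	return false;
}
```

The only difference between prefer-read and read-only in this sketch is the fallback: read-only never records a read-write host, so a second pass cannot rescue the connection attempt.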
0006-Primary-prefer-standby-and-standby-options.patch (application/octet-stream)
From 650ebafc126f4f77fa039c07553498fd3e6d2db1 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 10 Apr 2019 23:19:09 +1000
Subject: [PATCH 6/8] Primary, prefer-standby and standby options
Add new options that check whether a server is in recovery mode
before it is considered for connection. To determine whether the
server is in recovery mode, libpq sends the query
'SELECT pg_is_in_recovery()' to the server.
---
doc/src/sgml/libpq.sgml | 26 ++-
src/interfaces/libpq/fe-connect.c | 236 +++++++++++++++++++++++---
src/interfaces/libpq/libpq-fe.h | 8 +-
src/interfaces/libpq/libpq-int.h | 4 +-
src/test/recovery/t/001_stream_rep.pl | 18 +-
5 files changed, 262 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 2be51b2a49..6b84accd08 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1649,7 +1649,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>prefer-standby</literal> and <literal>standby</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1682,6 +1683,29 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
If this parameter is set to <literal>read-only</literal>, only a connection
in which read-only transactions are accepted by default is considered acceptable.
</para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, only a connection in which
+ the server is not in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection in which
+ the server is in recovery mode is preferred. If no such connection can be found,
+ then a connection in which the server is not in recovery mode will be considered.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server is in recovery mode, the query <literal>SELECT pg_is_in_recovery()</literal>
+ will be sent upon any successful connection; if it returns <literal>t</literal>, the server
+ is in recovery mode.
+ </para>
+
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 3217ae0303..59efaa5133 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -124,6 +124,7 @@ static int ldapServiceLookup(const char *purl, PQconninfoOption *options,
#define DefaultOption ""
#define DefaultAuthtype ""
#define DefaultTargetSessionAttrs "any"
+#define DefaultTargetServerType "any"
#ifdef USE_SSL
#define DefaultSSLMode "prefer"
#else
@@ -340,7 +341,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1302,6 +1303,12 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else if (strcmp(conn->target_session_attrs, "read-only") == 0)
conn->requested_session_type = SESSION_TYPE_READ_ONLY;
+ else if (strcmp(conn->target_session_attrs, "primary") == 0)
+ conn->requested_session_type = SESSION_TYPE_PRIMARY;
+ else if (strcmp(conn->target_session_attrs, "prefer-standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_STANDBY;
+ else if (strcmp(conn->target_session_attrs, "standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_STANDBY;
else
{
conn->status = CONNECTION_BAD;
@@ -2196,6 +2203,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -2235,19 +2243,19 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- if (conn->read_write_host_index >= 0)
+ if (conn->read_write_or_primary_host_index >= 0)
{
/*
* Getting here means, failed to connect to read-only servers
* and now try connect to read-write server again.
*/
- conn->whichhost = conn->read_write_host_index;
+ conn->whichhost = conn->read_write_or_primary_host_index;
/*
* Reset the host index value to avoid recursion during the
* second connection attempt.
*/
- conn->read_write_host_index = -2;
+ conn->read_write_or_primary_host_index = -2;
}
else
{
@@ -3504,7 +3512,9 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->requested_session_type != SESSION_TYPE_ANY)
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY))
{
if (conn->sversion < 120000)
{
@@ -3547,7 +3557,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
/* Append error report to conn->errorMessage. */
@@ -3578,8 +3588,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3594,30 +3604,70 @@ keep_going: /* We will come back to here until there is
}
/*
- * Requested type is prefer-read, then record this host index
- * and try the other before considering it later. If requested
- * type of connection is read-only, ignore this connection.
+ * Servers before 9.0 don't support recovery, skip the check
+ * when the requested type of connection is primary,
+ * prefer-standby or standby.
*/
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_session_type == SESSION_TYPE_PRIMARY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ /*
+ * Requested type is prefer-read or prefer-standby, then
+ * record this host index and try the other before considering
+ * it later. If requested type of connection is read-only or
+ * standby, ignore this connection.
+ */
+
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
- conn->requested_session_type == SESSION_TYPE_READ_ONLY)
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)
{
/*
* The following scenario is possible only for the
- * prefer-read mode for the next pass of the list of
- * connections as it couldn't find any servers that are
- * default read-only.
+ * prefer-read or prefer-standby mode for the next pass of
+ * the list of connections as it couldn't find any servers
+ * that are default read-only or in recovery mode.
*/
- if (conn->read_write_host_index == -2)
- goto target_accept_connection;
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY)
+ {
+ if (conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+ }
/*
* Try next host if any, but we don't want to consider
@@ -3719,7 +3769,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_write_connection;
/* Not a requested type; fail this connection. */
@@ -3754,8 +3804,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3808,6 +3858,144 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in recovery mode and requested mode is
+ * primary, ignore it. Server is not in recovery mode and
+ * requested mode is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server is found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_recovery_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
+ consume_checked_recovery_connection:
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3953,7 +4141,7 @@ makeEmptyPGconn(void)
#endif
conn->requested_session_type = SESSION_TYPE_ANY;
- conn->read_write_host_index = -1;
+ conn->read_write_or_primary_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index a015579c43..6c5780c672 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
@@ -76,7 +77,10 @@ typedef enum
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
SESSION_TYPE_PREFER_READ, /* Prefer read only session */
- SESSION_TYPE_READ_ONLY /* Read only session */
+ SESSION_TYPE_READ_ONLY, /* Read only session */
+ SESSION_TYPE_PRIMARY, /* Primary server */
+ SESSION_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SESSION_TYPE_STANDBY /* Standby server */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index a00d8bddeb..740033e116 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -366,7 +366,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read and read-only.
+ * prefer-read, read-only, primary, prefer-standby and standby.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -410,7 +410,7 @@ struct pg_conn
* Initial value is -1, then the index of the first read-write host, -2
* during the second attempt of connection to avoid recursion.
*/
- int read_write_host_index;
+ int read_write_or_primary_host_index;
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index ac1e11e1ab..8fa28dab23 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 37;
+use Test::More tests => 41;
# Initialize master node
my $node_master = get_new_node('master');
@@ -141,6 +141,22 @@ test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"read-only", 0);
+# Connect to master in "primary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
2.21.0
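The two-pass host scan that prefer-standby relies on in the patch above can be modelled without any networking. This is a sketch under stated assumptions, not the libpq implementation: choose_host, RecoveryMode and the in_recovery array are hypothetical names, each array element stands for the result of SELECT pg_is_in_recovery() on that host, and the local variable fallback plays the role of read_write_or_primary_host_index.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the three new target_session_attrs values. */
typedef enum
{
	MODE_PRIMARY,
	MODE_PREFER_STANDBY,
	MODE_STANDBY
} RecoveryMode;

/*
 * Return the index of the host that would be chosen, or -1 if no host in
 * the list is acceptable.  in_recovery[i] is true when host i is a standby.
 */
static int
choose_host(const bool *in_recovery, int nhosts, RecoveryMode mode)
{
	int			fallback = -1;	/* first primary seen, prefer-standby only */

	for (int i = 0; i < nhosts; i++)
	{
		bool		standby = in_recovery[i];

		/* First pass: take the first host that matches the request. */
		if (mode == MODE_PRIMARY && !standby)
			return i;
		if (mode != MODE_PRIMARY && standby)
			return i;

		/* Remember the first primary so prefer-standby can come back to it. */
		if (mode == MODE_PREFER_STANDBY && fallback == -1)
			fallback = i;
	}

	/* Second pass: only prefer-standby ever recorded a fallback host. */
	return fallback;
}
```

With a host list of master,master the sketch returns index 0 for prefer-standby and -1 for standby, which matches the intent of the new test cases in 001_stream_rep.pl.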
0007-New-function-to-rejecting-the-checked-write-connecti.patch (application/octet-stream)
From c58ace1fb9356ccab5b2c9f7fcbb1bf665da9960 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Mon, 25 Mar 2019 18:11:18 +1100
Subject: [PATCH 7/8] New function for rejecting the checked write connection
After a connection has been checked for writability, if the
result leads us to reject it, call the newly added function
to do so.
---
src/interfaces/libpq/fe-connect.c | 123 ++++++++++++------------------
1 file changed, 47 insertions(+), 76 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 59efaa5133..529985ea0c 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2121,6 +2121,51 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+static void
+reject_checked_write_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3547,10 +3592,6 @@ keep_going: /* We will come back to here until there is
(conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
- /* Not a requested type; fail this connection. */
- const char *displayed_host;
- const char *displayed_port;
-
/*
* The following scenario is possible only for the
* prefer-read mode for the next pass of the list of
@@ -3560,42 +3601,7 @@ keep_going: /* We will come back to here until there is
if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
@@ -3776,42 +3782,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
--
2.21.0
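The refactoring in the patch above moves the host and port selection and the error text into one function. A standalone sketch of just that message-building step might look as follows. It is an illustration only: SketchHost, sketch_reject_message and SKETCH_DEFAULT_PORT are hypothetical names, "5432" is an assumed stand-in for DEF_PGPORT_STR, and the single format string is a simplification, since the patch keeps two separate translatable strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-in for DEF_PGPORT_STR; the real value is set at build time. */
#define SKETCH_DEFAULT_PORT "5432"

/* Hypothetical, simplified view of one entry in conn->connhost[]. */
typedef struct
{
	bool		is_hostaddr;	/* models type == CHT_HOST_ADDRESS */
	const char *host;
	const char *hostaddr;
	const char *port;			/* may be NULL or empty */
} SketchHost;

/*
 * Write the rejection message for one host into buf.  Returns the value of
 * snprintf(), i.e. the length the complete message needs.
 */
static int
sketch_reject_message(char *buf, size_t len, const SketchHost *h,
					  bool wanted_read_write)
{
	const char *displayed_host = h->is_hostaddr ? h->hostaddr : h->host;
	const char *displayed_port = h->port;

	assert(buf != NULL && len > 0);

	/* Fall back to the default port when none was given. */
	if (displayed_port == NULL || strlen(displayed_port) == 0)
		displayed_port = SKETCH_DEFAULT_PORT;

	return snprintf(buf, len,
					"could not make a %s connection to server \"%s:%s\"\n",
					wanted_read_write ? "writable" : "readonly",
					displayed_host, displayed_port);
}
```

Keeping this logic in one place means the SHOW transaction_read_only path and the reported-parameter path produce identical messages, which is the point of the refactoring.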
0008-Server-recovery-mode-handling.patch (application/octet-stream)
From f6a62b5a53c16ab793ff5ada2d6fd06d39d9e04a Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 28 Mar 2019 15:30:01 +1100
Subject: [PATCH 8/8] Server recovery mode handling
Add an in_recovery GUC_REPORT parameter so that clients are told
whether the server is in recovery mode. This lets a client
connection identify a standby server with a faster check than
executing a command.
Add a new SIGUSR1 interrupt to report the exit from recovery
mode to all backends and their respective clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
---
doc/src/sgml/libpq.sgml | 14 ++-
doc/src/sgml/protocol.sgml | 8 +-
src/backend/access/transam/xlog.c | 3 +
src/backend/commands/async.c | 1 -
src/backend/storage/ipc/procarray.c | 28 ++++++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 ++
src/backend/tcop/postgres.c | 60 ++++++++++++
src/backend/utils/init/postinit.c | 6 +-
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 17 +++-
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/include/tcop/tcopprot.h | 2 +
src/interfaces/libpq/fe-connect.c | 133 +++++++++++++++++----------
src/interfaces/libpq/fe-exec.c | 4 +
src/interfaces/libpq/libpq-int.h | 1 +
18 files changed, 234 insertions(+), 61 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 6b84accd08..90e8c96820 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1702,8 +1702,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server is in recovery mode, the query <literal>SELECT pg_is_in_recovery()</literal>
- will be sent upon any successful connection; if it returns <literal>t</literal>, the server
- is in recovery mode.
+ will be sent upon any successful connection if the server is prior to version 12; if it returns
+ <literal>t</literal>, the server is in recovery mode. For server version 12 or greater, libpq
+ instead uses the value of the <varname>in_recovery</varname> configuration parameter that is reported by the
+ server upon successful connection.
</para>
</listitem>
@@ -2071,15 +2073,17 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by release before 12.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 05286c7365..9ed2a03b73 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1284,15 +1284,17 @@ SELECT 1/0;
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 13e0d2366f..622b0870af 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7766,6 +7766,9 @@ StartupXLOG(void)
XLogCtl->SharedRecoveryInProgress = false;
SpinLockRelease(&XLogCtl->info_lck);
+ if (standbyState != STANDBY_DISABLED)
+ SendRecoveryExitSignal();
+
UpdateControlFile();
LWLockRelease(ControlFileLock);
diff --git a/src/backend/commands/async.c b/src/backend/commands/async.c
index 738e6ec7e2..102094e765 100644
--- a/src/backend/commands/async.c
+++ b/src/backend/commands/async.c
@@ -1736,7 +1736,6 @@ ProcessNotifyInterrupt(void)
ProcessIncomingNotify();
}
-
/*
* Read all pending notifications from the queue, and deliver appropriate
* ones to my frontend. Stop when we reach queue head or an uncommitted
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 18a0f62ba6..ad72ab10c7 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2972,6 +2972,34 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
return true; /* timed out, still conflicts */
}
+/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
/*
* ProcArraySetReplicationSlotXmin
*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7605b2c367..e4548dc323 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -292,6 +292,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
SetLatch(MyLatch);
latch_sigusr1_handler();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 25b7e314af..d55366af4e 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -138,6 +138,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 44a59e1d4f..33511a9433 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -167,6 +167,15 @@ static bool RecoveryConflictPending = false;
static bool RecoveryConflictRetryable = true;
static ProcSignalReason RecoveryConflictReason;
+/*
+ * Inbound recovery-exit signals are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
+
/* reused buffer to pass to SendRowDescriptionMessage() */
static MemoryContext row_description_context = NULL;
static StringInfoData row_description_buf;
@@ -195,6 +204,7 @@ static void drop_unnamed_stmt(void);
static void log_disconnections(int code, Datum arg);
static void enable_statement_timeout(void);
static void disable_statement_timeout(void);
+static void ProcessRecoveryExitInterrupt(void);
/* ----------------------------------------------------------------
@@ -543,6 +553,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
@@ -2954,6 +2968,52 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
errno = save_errno;
}
+/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited the recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+*/
+static void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
+
+
/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index e9f72b5069..29a159dc11 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -651,7 +651,11 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
* This is handled by calling RecoveryInProgress and ignoring the
* result.
*/
- (void) RecoveryInProgress();
+ if (RecoveryInProgress())
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
}
else
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index d228bbed68..1c0f51ac8b 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_int \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 5357e51d3f..719ea6fc24 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -581,7 +581,7 @@ static char *recovery_target_string;
static char *recovery_target_xid_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
-
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
char *role_string;
@@ -1770,6 +1770,21 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ /*
+ * Not for general use --- used to indicate whether the instance is
+ * in recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
{
{"allow_system_table_mods", PGC_POSTMASTER, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index da8b672096..86f0c13134 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -113,6 +113,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void XidCacheRemoveRunningXids(TransactionId xid,
int nxids, const TransactionId *xids,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 05b186a05c..9cf9560b06 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -42,6 +42,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index a3f8f82ff3..2c73f0c0a8 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index ec21f7e45c..ed21a9e2f2 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -66,6 +66,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
* handler */
+/* recovery exit interrupt handling function */
+extern void HandleRecoveryExitInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 529985ea0c..82d952eafc 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2166,6 +2166,49 @@ reject_checked_write_connection(PGconn *conn)
conn->try_next_host = true;
}
+static void
+reject_checked_recovery_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3619,27 +3662,52 @@ keep_going: /* We will come back to here until there is
conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
conn->requested_session_type == SESSION_TYPE_STANDBY)))
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
}
+ else if ((conn->in_recovery &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
- conn->status = CONNECTION_CHECK_RECOVERY;
+ reject_checked_recovery_connection(conn);
+ goto keep_going;
+ }
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
}
/*
@@ -3888,40 +3956,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is not in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record primary host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_recovery_connection(conn);
goto keep_going;
}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 701c4a66fe..7cffed57cd 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1117,6 +1117,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
{
conn->transaction_read_only = (strcmp(value, "on") == 0);
}
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 740033e116..7b36ddb804 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -443,6 +443,7 @@ struct pg_conn
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
bool transaction_read_only; /* transaction_read_only */
+ bool in_recovery; /* in_recovery */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.21.0
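The handler/processing split that the postgres.c hunk above describes can be sketched in isolation. This is only a minimal illustration, not the backend code: the names handle_recovery_exit, install_recovery_exit_handler, process_recovery_exit and session_in_recovery are hypothetical, and standard-C SIGINT stands in for the backend's SIGUSR1-based ProcSignal mechanism purely to keep the sketch portable. The latch and the ParameterStatus message are left out.

```c
#include <signal.h>
#include <stdbool.h>

/*
 * Written by the signal handler, read by the main loop.  volatile
 * sig_atomic_t is the object type the C standard allows a handler
 * to assign to safely.
 */
static volatile sig_atomic_t recovery_exit_pending = 0;

/* Session state; only ever touched outside the handler. */
static bool session_in_recovery = true;

/* Handler half: record that work is pending and return immediately. */
static void
handle_recovery_exit(int signo)
{
    (void) signo;
    recovery_exit_pending = 1;
}

/* Install the handler; returns false if signal() fails. */
static bool
install_recovery_exit_handler(void)
{
    return signal(SIGINT, handle_recovery_exit) != SIG_ERR;
}

/*
 * Processing half: called from the main loop at a point where it is
 * safe to do real work.  Returns true if the session state changed.
 * (The patch calls SetConfigOption("in_recovery", "off", ...) and
 * pq_flush() at this point.)
 */
static bool
process_recovery_exit(void)
{
    if (!recovery_exit_pending)
        return false;

    recovery_exit_pending = 0;
    session_in_recovery = false;
    return true;
}
```

A caller would install the handler once at startup and call process_recovery_exit() each time it returns from waiting for client input. Keeping the handler down to a single flag assignment is what makes it safe to run at an arbitrary point in the program.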
Question about 0001. In the CONNECTION_SETENV state, you end by falling
through to the CONNECTION_CHECK_TARGET case; but in that switch it seems
a bit unnatural to do that. I think doing "goto keep_going" is another
way of doing the same thing, but more in line with what every other
piece of code does.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
0001-Restructure-the-code-to-remove-duplicate-code.patch (text/x-diff; charset=us-ascii)
From 30fbb4a913697fe35a195dd41ddd5bcfeec53c0e Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Fri, 6 Sep 2019 16:26:01 -0400
Subject: [PATCH] Restructure the code to remove duplicate code
The logic that checks the server version before issuing
SHOW transaction_read_only, to find out whether the server is
read-write, was duplicated in two places. Move it under a new
connection status, CONNECTION_CHECK_TARGET, so that the duplication
is removed. This is required for the next set of patches.
Author: Hari Babu Kommi
Discussion: https://postgr.es/m/CAJrrPGe_qgdbbN+yBgEVpd+YLHXXjTruzk6RmTMhqrFig+32ag@mail.gmail.com
---
src/interfaces/libpq/fe-connect.c | 87 ++++++++++++-------------------
src/interfaces/libpq/libpq-fe.h | 3 +-
2 files changed, 34 insertions(+), 56 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f03d43376c..91316709a5 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3434,6 +3434,13 @@ keep_going: /* We will come back to here until there is
return PGRES_POLLING_WRITING;
}
+ /* Almost there now ... */
+ conn->status = CONNECTION_CHECK_TARGET;
+ goto keep_going;
+ }
+
+ case CONNECTION_CHECK_TARGET:
+ {
/*
* If a read-write connection is required, see if we have one.
*
@@ -3476,67 +3483,37 @@ keep_going: /* We will come back to here until there is
}
case CONNECTION_SETENV:
-
- /*
- * Do post-connection housekeeping (only needed in protocol 2.0).
- *
- * We pretend that the connection is OK for the duration of these
- * queries.
- */
- conn->status = CONNECTION_OK;
-
- switch (pqSetenvPoll(conn))
{
- case PGRES_POLLING_OK: /* Success */
- break;
-
- case PGRES_POLLING_READING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_READING;
-
- case PGRES_POLLING_WRITING: /* Still going */
- conn->status = CONNECTION_SETENV;
- return PGRES_POLLING_WRITING;
-
- default:
- goto error_return;
- }
-
- /*
- * If a read-write connection is required, see if we have one.
- * (This should match the stanza in the CONNECTION_AUTH_OK case
- * above.)
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but by
- * the same token they don't have any read-only mode, so we may
- * just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
- {
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
+ /*
+ * Do post-connection housekeeping (only needed in protocol 2.0).
+ *
+ * We pretend that the connection is OK for the duration of these
+ * queries.
+ */
conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+
+ switch (pqSetenvPoll(conn))
{
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ case PGRES_POLLING_OK: /* Success */
+ break;
+
+ case PGRES_POLLING_READING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_READING;
+
+ case PGRES_POLLING_WRITING: /* Still going */
+ conn->status = CONNECTION_SETENV;
+ return PGRES_POLLING_WRITING;
+
+ default:
+ goto error_return;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+
+ /* Almost there now ... */
+ conn->status = CONNECTION_CHECK_TARGET;
+ goto keep_going;
}
- /* We can release the address list now. */
- release_conn_addrinfo(conn);
-
- /* We are open for business! */
- conn->status = CONNECTION_OK;
- return PGRES_POLLING_OK;
-
case CONNECTION_CONSUME:
{
conn->status = CONNECTION_OK;
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 22c4954f2b..5f65db30e4 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -67,7 +67,8 @@ typedef enum
* connection. */
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
- CONNECTION_GSS_STARTUP /* Negotiating GSSAPI. */
+ CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
+ CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
typedef enum
--
2.17.1
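The difference between falling through to the next case and making an explicit transition can be shown with a toy poll function. This is a simplified sketch that assumes a three-state machine; toy_conn, conn_state, poll_result and toy_connect_poll are hypothetical names, and only the shape (set the next status, then jump back to the top of the switch) mirrors what the patch does with CONNECTION_CHECK_TARGET.

```c
/* Hypothetical, much reduced stand-ins for ConnStatusType and PGconn. */
typedef enum
{
    ST_SETENV,
    ST_CHECK_TARGET,
    ST_OK
} conn_state;

typedef enum
{
    POLL_READING,               /* caller must wait for input and call again */
    POLL_OK                     /* connection is ready */
} poll_result;

typedef struct
{
    conn_state  status;
    int         setenv_done;    /* nonzero once housekeeping has finished */
    int         steps;          /* how many times the switch was entered */
} toy_conn;

static poll_result
toy_connect_poll(toy_conn *conn)
{
keep_going:                     /* we come back here after every transition */
    conn->steps++;

    switch (conn->status)
    {
        case ST_SETENV:
            if (!conn->setenv_done)
                return POLL_READING;

            /* Explicit transition instead of falling into the next case. */
            conn->status = ST_CHECK_TARGET;
            goto keep_going;

        case ST_CHECK_TARGET:
            /* The target-session check would go here. */
            conn->status = ST_OK;
            goto keep_going;

        case ST_OK:
            return POLL_OK;
    }

    return POLL_OK;             /* not reached; keeps compilers quiet */
}
```

Because every case ends in either a return or a goto, the order of the case labels no longer matters, and a state can be entered from several places without relying on its position in the switch.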
On 2019-Sep-09, Alvaro Herrera from 2ndQuadrant wrote:
Question about 0001. In the CONNECTION_SETENV state, you end by falling
through to the CONNECTION_CHECK_TARGET case; but in that switch it seems
a bit unnatural to do that. I think doing "goto keep_going" is another
way of doing the same thing, but more in line with what every other
piece of code does.
Appears to work. Pushed like that.
Testing protocol version 2 is difficult! Almost every single test fails
because of error messages being reported differently; and streaming
replication (incl. pg_basebackup) doesn't work at all because it's not
possible to establish replication connections. Manual inspection shows
it behaves correctly.
Remaining patchset attached (per my count it's v13 of your patchset.
Please use "git format-patch -v14" and so on when posting patches). I
stripped the doc change from 0001 (unchanged) because I found it hard to
justify on its own, and it has a couple of grammatical mistakes. I
think we should merge one half of it with each of the other two patches
where the changes are introduced (0003 and 0004). I'm not convinced
that we need 0004-0006 to be separate commits.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: Alvaro Herrera from 2ndQuadrant [mailto:alvherre@alvh.no-ip.org]
Testing protocol version 2 is difficult! Almost every single test fails
because of error messages being reported differently; and streaming
replication (incl. pg_basebackup) doesn't work at all because it's not
possible to establish replication connections. Manual inspection shows
it behaves correctly.
Yeah, the code path for protocol v2 is sometimes annoying. I hope v2 support will be dropped soon. I know there was a discussion about it, but I didn't follow it to its conclusion.
Remaining patchset attached (per my count it's v13 of your patchset.
I'm afraid those weren't attached.
think we should merge one half of it with each of the other two patches
where the changes are introduced (0003 and 0004). I'm not convinced
that we need 0004-0006 to be separate commits.
It was hard to review those separate patches, so I think it's better to merge those. OTOH, I can understand Haribabu-san's idea that the community may not accept the latter patches, e.g. accept only 0001-0005.
Regards
Takayuki Tsunakawa
On 2019-Sep-11, Tsunakawa, Takayuki wrote:
From: Alvaro Herrera from 2ndQuadrant [mailto:alvherre@alvh.no-ip.org]
Remaining patchset attached (per my count it's v13 of your patchset.
I'm afraid those weren't attached.
Oh, oops. Here they are then.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v13-0002-doc-change.patch (text/x-diff; charset=us-ascii)
From 9399e3f7f5e85f41871d4e586b0582f697380c0b Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Tue, 10 Sep 2019 12:48:28 -0300
Subject: [PATCH v13 2/8] doc change
---
doc/src/sgml/libpq.sgml | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 5601485555..23bf1ea632 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1647,17 +1647,27 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<varlistentry id="libpq-connect-target-session-attrs" xreflabel="target_session_attrs">
<term><literal>target_session_attrs</literal></term>
<listitem>
+ <para>
+ The supported options for this parameter are, <literal>any</literal> and
+ <literal>read-write</literal>. The default value of this parameter,
+ <literal>any</literal>, regards all connections as acceptable.
+ If multiple hosts were specified in the connection string, based on the
+ specified value, any remaining servers will be tried before confirming
+ succesful connection or failure.
+ </para>
+
<para>
If this parameter is set to <literal>read-write</literal>, only a
connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server supports read-write transactions are not,
+ query <literal>SHOW transaction_read_only</literal> will be sent upon any
+ successful connection; if it returns <literal>on</literal>, it means server
+ doesn't support read-write transactions.
+ </para>
</listitem>
</varlistentry>
</variablelist>
--
2.17.1
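The host-selection rule that the documentation change above describes (a read-only server is skipped just as if the connection attempt had failed) can be modelled without any network code. This is a minimal sketch under stated assumptions: toy_host and pick_host are hypothetical, and the transaction_read_only field stands in for the answer the server would give to SHOW transaction_read_only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one entry in a multi-host connection string. */
typedef struct
{
    const char *name;
    bool        reachable;              /* did the connection attempt succeed? */
    bool        transaction_read_only;  /* what the server would report */
} toy_host;

/*
 * Return the index of the first acceptable host, or -1 if there is none.
 * With want_read_write set (target_session_attrs=read-write), a host that
 * reports transaction_read_only = on is skipped like an unreachable one.
 */
static int
pick_host(const toy_host *hosts, size_t nhosts, bool want_read_write)
{
    for (size_t i = 0; i < nhosts; i++)
    {
        if (!hosts[i].reachable)
            continue;           /* connection failed, try the next host */

        if (want_read_write && hosts[i].transaction_read_only)
            continue;           /* connected, but not the requested type */

        return (int) i;
    }

    return -1;
}
```

With the default value any, the first reachable host wins regardless of its read-only status, which is why want_read_write = false never inspects the second flag.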
v13-0001-New-TargetSessionAttrsType-enum.patch (text/x-diff; charset=us-ascii)
From 247e888266882ab673efd04ecf017846400859ad Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Feb 2019 11:50:33 +1100
Subject: [PATCH v13 1/8] New TargetSessionAttrsType enum
This new enum lets the code compare the requested session type
against enum values instead of comparing strings every time. It
brings little improvement to the current code, but it will be
useful for the later patches.
---
src/interfaces/libpq/fe-connect.c | 12 ++++++++----
src/interfaces/libpq/libpq-fe.h | 6 ++++++
src/interfaces/libpq/libpq-int.h | 1 +
3 files changed, 15 insertions(+), 4 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 8ba0159313..f104fd48aa 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1295,8 +1295,11 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ else if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -3449,8 +3452,7 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type != SESSION_TYPE_ANY)
{
/*
* Save existing error messages across the PQsendQuery
@@ -3791,6 +3793,8 @@ makeEmptyPGconn(void)
conn->try_gss = true;
#endif
+ conn->requested_session_type = SESSION_TYPE_ANY;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 5f65db30e4..0612e68c62 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -71,6 +71,12 @@ typedef enum
CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
} ConnStatusType;
+typedef enum
+{
+ SESSION_TYPE_ANY = 0, /* Any session (default) */
+ SESSION_TYPE_READ_WRITE /* Read-write session */
+} TargetSessionAttrsType;
+
typedef enum
{
PGRES_POLLING_FAILED = 0,
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index d37bb3ce40..20791b5b73 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -367,6 +367,7 @@ struct pg_conn
/* Type of connection to make. Possible values: any, read-write. */
char *target_session_attrs;
+ TargetSessionAttrsType requested_session_type;
/* Optional file to write trace info to */
FILE *Pfdebug;
--
2.17.1
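The string-to-enum step that this patch adds to connectOptions2() can be tried standalone. The enum below is copied from the patch; the helper parse_target_session_attrs is a hypothetical wrapper written for this sketch, and it simply reports failure where the real code sets CONNECTION_BAD and an error message.

```c
#include <stdbool.h>
#include <string.h>

/* Same values as the enum the patch adds to libpq-fe.h. */
typedef enum
{
    SESSION_TYPE_ANY = 0,       /* Any session (default) */
    SESSION_TYPE_READ_WRITE     /* Read-write session */
} TargetSessionAttrsType;

/*
 * Hypothetical helper: translate the option string once, so that later
 * code can compare enum values instead of calling strcmp() repeatedly.
 * A NULL value means the option was not given and selects the default.
 * Returns false, leaving *result untouched, for an unrecognized value.
 */
static bool
parse_target_session_attrs(const char *value, TargetSessionAttrsType *result)
{
    if (value == NULL || strcmp(value, "any") == 0)
        *result = SESSION_TYPE_ANY;
    else if (strcmp(value, "read-write") == 0)
        *result = SESSION_TYPE_READ_WRITE;
    else
        return false;

    return true;
}
```

Once the value is stored as an enum, a later patch that adds a new session type only has to extend the enum and this one parsing step; the state machine then tests the enum member rather than the option string.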
v13-0003-Make-transaction_read_only-as-GUC_REPORT-variabl.patch (text/x-diff; charset=us-ascii)
From c0b48576c961ed0c0304ac5c83363acf6a51f818 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 17:05:24 +1100
Subject: [PATCH v13 3/8] Make transaction_read_only as GUC_REPORT variable
The transaction_read_only GUC is used in multi-host connections to
identify a read-write host, but currently the client finds its value
by executing a command on each host it connects to. Reporting the
GUC to the client upon connection instead reduces the time needed
to make the connection.
---
doc/src/sgml/libpq.sgml | 14 ++++---
doc/src/sgml/protocol.sgml | 8 ++--
src/backend/utils/misc/guc.c | 2 +-
src/interfaces/libpq/fe-connect.c | 70 +++++++++++++++++++++++--------
src/interfaces/libpq/fe-exec.c | 6 ++-
src/interfaces/libpq/libpq-int.h | 1 +
6 files changed, 74 insertions(+), 27 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 23bf1ea632..edda94196b 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1665,8 +1665,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server supports read-write transactions are not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, it means server
- doesn't support read-write transactions.
+ successful connection if the server is prior to version 12; if it returns
+ <literal>on</literal>, it means server doesn't support read-write transactions.
+ But for server version 12 or greater uses the value of <varname>transaction_read_only</varname>
+ configuration parameter that is reported by the server upon successful connection.
</para>
</listitem>
</varlistentry>
@@ -1977,14 +1979,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by release before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 80275215e0..dbf12fcc46 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELECT 1/0;
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 90ffd89339..3c47473bcd 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1543,7 +1543,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f104fd48aa..abac93d6da 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3454,26 +3454,61 @@ keep_going: /* We will come back to here until there is
if (conn->sversion >= 70400 &&
conn->requested_session_type != SESSION_TYPE_ANY)
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
-
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
+ }
+ else if (conn->transaction_read_only)
+ {
+ /* Not writable; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
}
/* We can release the address list now. */
@@ -3786,6 +3821,7 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b3c59a0992..3c17100e05 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1059,7 +1059,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1113,6 +1113,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 20791b5b73..7d26b94f9f 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -432,6 +432,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.17.1
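To make the version-dependent check in the patch above easier to follow, here is a minimal, self-contained sketch of the decision it appears to implement. It is not libpq code: SketchConn, SketchAction and the sketch_* functions are hypothetical names that stand in for the few PGconn fields and branches the patch touches.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the PGconn fields consulted by the check. */
typedef struct
{
	int			sversion;				/* server version, e.g. 110005 */
	bool		transaction_read_only;	/* last value reported by the server */
} SketchConn;

typedef enum
{
	SKETCH_ACCEPT,		/* keep this connection */
	SKETCH_SEND_SHOW,	/* old server: must send SHOW transaction_read_only */
	SKETCH_REJECT		/* server is read-only: try the next host */
} SketchAction;

/* Same idea as the new branch in pqSaveParameterStatus(). */
static void
sketch_save_parameter(SketchConn *conn, const char *name, const char *value)
{
	if (strcmp(name, "transaction_read_only") == 0)
		conn->transaction_read_only = (strcmp(value, "on") == 0);
}

/* Decide what to do when a read-write session may have been requested. */
static SketchAction
sketch_check_read_write(const SketchConn *conn, bool want_read_write)
{
	/* Servers before 7.4 have no read-only mode; "any" needs no check. */
	if (conn->sversion < 70400 || !want_read_write)
		return SKETCH_ACCEPT;

	/* Servers before 12 do not report the parameter, so ask explicitly. */
	if (conn->sversion < 120000)
		return SKETCH_SEND_SHOW;

	/* Version 12 or later: rely on the reported value. */
	return conn->transaction_read_only ? SKETCH_REJECT : SKETCH_ACCEPT;
}
```

With a version 12 server that has reported transaction_read_only = on, sketch_check_read_write() returns SKETCH_REJECT without a round trip; for a version 11 server it returns SKETCH_SEND_SHOW, which corresponds to the existing CONNECTION_CHECK_WRITABLE path.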
Attachment: v13-0004-New-prefer-read-target_session_attrs-type.patch (text/x-diff; charset=us-ascii)
From 2dee9b996ed8f242ceb5d0f1924a9c0455620e53 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 17:56:48 +1100
Subject: [PATCH v13 4/8] New prefer-read target_session_attrs type
With the prefer-read option type, an application can prefer
connecting to a read-only server if one is available in the list
of hosts, and otherwise connect to a read-write server.
---
doc/src/sgml/libpq.sgml | 21 ++--
src/interfaces/libpq/fe-connect.c | 161 ++++++++++++++++++++++----
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 13 ++-
src/test/recovery/t/001_stream_rep.pl | 14 ++-
5 files changed, 177 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index edda94196b..29f9ae5c78 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1648,12 +1648,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- The supported options for this parameter are, <literal>any</literal> and
- <literal>read-write</literal>. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- If multiple hosts were specified in the connection string, based on the
- specified value, any remaining servers will be tried before confirming
- succesful connection or failure.
+ The supported options for this parameter are, <literal>any</literal>,
+ <literal>read-write</literal> and <literal>prefer-read</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts were specified in the
+ connection string, based on the specified value, any remaining servers
+ will be tried before confirming successful connection or failure.
</para>
<para>
@@ -1662,6 +1662,13 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
is considered acceptable.
</para>
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, a
+ connection in which read-only transactions are accepted by default
+ is preferred. If no such connection can be found, then a connection
+ in which read-write transactions are accepted will be considered.
+ </para>
+
<para>
To find out whether the server supports read-write transactions or not,
query <literal>SHOW transaction_read_only</literal> will be sent upon any
@@ -1671,7 +1678,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
configuration parameter that is reported by the server upon successful connection.
</para>
</listitem>
- </varlistentry>
+ </varlistentry>
</variablelist>
</para>
</sect2>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index abac93d6da..8673b8e903 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -341,7 +341,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1299,6 +1299,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_ANY;
else if (strcmp(conn->target_session_attrs, "read-write") == 0)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else
{
conn->status = CONNECTION_BAD;
@@ -2232,13 +2234,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to any read-only
+ * server, so now try to connect to the read-write server again.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3445,7 +3465,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write or prefer-read connection is required, see
+ * if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3460,8 +3481,8 @@ keep_going: /* We will come back to here until there is
* Save existing error messages across the PQsendQuery
* attempt. This is necessary because PQsendQuery is
* going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
*/
if (!saveErrorMessage(conn, &savedMessage))
goto error_return;
@@ -3473,16 +3494,30 @@ keep_going: /* We will come back to here until there is
restoreErrorMessage(conn, &savedMessage);
goto error_return;
}
+
conn->status = CONNECTION_CHECK_WRITABLE;
+
restoreErrorMessage(conn, &savedMessage);
return PGRES_POLLING_READING;
}
- else if (conn->transaction_read_only)
+ else if ((conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
const char *displayed_host;
const char *displayed_port;
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_target_connection;
+
/* Append error report to conn->errorMessage. */
if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
displayed_host = conn->connhost[conn->whichhost].hostaddr;
@@ -3492,16 +3527,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3509,8 +3556,36 @@ keep_going: /* We will come back to here until there is
conn->try_next_host = true;
goto keep_going;
}
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
}
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later
+ */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index != -2)
+ {
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3608,11 +3683,33 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested mode is read-write,
+ * ignore it. Server is read-write and requested mode is
+ * prefer-read, record it for the first time and try to
+ * consume in the next scan (it means no read-only server
+ * is found in the first scan).
+ */
+ if ((readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ))
{
- /* Not writable; fail this connection. */
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_write_connection;
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3625,16 +3722,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3643,7 +3752,8 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3830,6 +3940,7 @@ makeEmptyPGconn(void)
#endif
conn->requested_session_type = SESSION_TYPE_ANY;
+ conn->read_write_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 0612e68c62..563f8b98ce 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -74,7 +74,8 @@ typedef enum
typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
- SESSION_TYPE_READ_WRITE /* Read-write session */
+ SESSION_TYPE_READ_WRITE, /* Read-write session */
+ SESSION_TYPE_PREFER_READ /* Prefer read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 7d26b94f9f..174f370818 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -365,7 +365,10 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: any, read-write,
+ * prefer-read.
+ */
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -402,6 +405,14 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * First read-write host index in the connection string.
+ *
+ * Initial value is -1, then the index of the first read-write host, -2
+ * during the second attempt of connection to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 3c743d7d7c..af465be505 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 32;
+use Test::More tests => 35;
# Initialize master node
my $node_master = get_new_node('master');
@@ -121,6 +121,18 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
2.17.1
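The two-pass host scan that prefer-read introduces can be hard to read from the diff alone, so here is a small standalone sketch of the selection rule as I understand it. It is a simplification under stated assumptions: the patch itself remembers the first read-write host in read_write_host_index, re-dials it after the list is exhausted, and uses -2 to mark that second attempt, whereas this sketch simply returns the remembered index. The function name and the reachable/read_only arrays are hypothetical.

```c
#include <stdbool.h>

/*
 * Return the index of the host a prefer-read scan would settle on, or -1
 * if no host is reachable.  reachable[i] and read_only[i] describe host i.
 */
static int
sketch_pick_prefer_read(const bool *reachable, const bool *read_only,
						int nhosts)
{
	int			first_read_write = -1;	/* like read_write_host_index */

	for (int i = 0; i < nhosts; i++)
	{
		if (!reachable[i])
			continue;			/* connection failed; try next host */

		if (read_only[i])
			return i;			/* preferred type found; use it */

		/* Read-write host: remember only the first one for later. */
		if (first_read_write == -1)
			first_read_write = i;
	}

	/* No read-only host was found; fall back to the remembered one. */
	return first_read_write;
}
```

For the three cases added to 001_stream_rep.pl, this picks the standby for both the master,standby1 and the standby1,master lists, and falls back to the master when the list contains only the master.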
Attachment: v13-0005-New-read-only-target_session_attrs-type.patch (text/x-diff; charset=us-ascii)
From 356b5c8969b04bd29c69de888efe5ab1cd8e3e3d Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 27 Mar 2019 18:02:59 +1100
Subject: [PATCH v13 5/8] New read-only target_session_attrs type
With the read-only option type, an application can connect
to a read-only server in the list of hosts. If no read-only
server is available, the connection attempt fails.
---
doc/src/sgml/libpq.sgml | 7 +++++-
src/interfaces/libpq/fe-connect.c | 34 ++++++++++++++++++++-------
src/interfaces/libpq/libpq-fe.h | 3 ++-
src/interfaces/libpq/libpq-int.h | 2 +-
src/test/recovery/t/001_stream_rep.pl | 10 +++++++-
5 files changed, 43 insertions(+), 13 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 29f9ae5c78..3a1071e408 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1649,7 +1649,7 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal> and <literal>prefer-read</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1677,6 +1677,11 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
For server version 12 or greater, libpq uses the value of the <varname>transaction_read_only</varname>
configuration parameter that is reported by the server upon successful connection.
</para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection
+ in which read-only transactions are accepted by default is considered acceptable.
+ </para>
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 8673b8e903..e660675808 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1301,6 +1301,8 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_READ_WRITE;
else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
+ else if (strcmp(conn->target_session_attrs, "read-only") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_ONLY;
else
{
conn->status = CONNECTION_BAD;
@@ -3465,8 +3467,8 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write or prefer-read connection is required, see
- * if we have one.
+ * If a read-write, prefer-read or read-only connection is
+ * required, see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
@@ -3503,7 +3505,8 @@ keep_going: /* We will come back to here until there is
else if ((conn->transaction_read_only &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!conn->transaction_read_only &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/* Not a requested type; fail this connection. */
const char *displayed_host;
@@ -3563,17 +3566,28 @@ keep_going: /* We will come back to here until there is
/*
* Requested type is prefer-read, then record this host index
- * and try the other before considering it later
+ * and try the other before considering it later. If requested
+ * type of connection is read-only, ignore this connection.
*/
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index != -2)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)
{
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that are
+ * default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto target_accept_connection;
+
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->read_write_host_index == -1)
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
conn->read_write_host_index = conn->whichhost;
/*
@@ -3693,12 +3707,14 @@ keep_going: /* We will come back to here until there is
* ignore it. Server is read-write and requested mode is
* prefer-read, record it for the first time and try to
* consume in the next scan (it means no read-only server
- * is found in the first scan).
+ * is found in the first scan). Server is read-write and
+ * requested mode is read-only, ignore this connection.
*/
if ((readonly_server &&
conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
(!readonly_server &&
- conn->requested_session_type == SESSION_TYPE_PREFER_READ))
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
/*
* The following scenario is possible only for the
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 563f8b98ce..fc0178cd5d 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -75,7 +75,8 @@ typedef enum
{
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
- SESSION_TYPE_PREFER_READ /* Prefer read only session */
+ SESSION_TYPE_PREFER_READ, /* Prefer read only session */
+ SESSION_TYPE_READ_ONLY /* Read only session */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 174f370818..e3c15b2ba8 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -367,7 +367,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read.
+ * prefer-read and read-only.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index af465be505..ac1e11e1ab 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 35;
+use Test::More tests => 37;
# Initialize master node
my $node_master = get_new_node('master');
@@ -133,6 +133,14 @@ test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
test_target_session_attrs($node_master, $node_master, $node_master,
"prefer-read", 0);
+# Connect to standby1 in "read-only" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "read-only", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
2.17.1
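As a reading aid for the read-only patch above, the following sketch restates how the four option values seen so far map to an accept/skip decision on the first pass over the host list. It is only an illustration: SkType, sk_parse and sk_first_pass_accepts are hypothetical names, not the enum or functions in libpq, and the prefer-read fallback on the second pass is deliberately left out.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical mirror of the session types handled up to this patch. */
typedef enum
{
	SK_ANY,
	SK_READ_WRITE,
	SK_PREFER_READ,
	SK_READ_ONLY
} SkType;

/* Parse a target_session_attrs value; returns false if unrecognized. */
static bool
sk_parse(const char *value, SkType *out)
{
	if (strcmp(value, "any") == 0)
		*out = SK_ANY;
	else if (strcmp(value, "read-write") == 0)
		*out = SK_READ_WRITE;
	else if (strcmp(value, "prefer-read") == 0)
		*out = SK_PREFER_READ;
	else if (strcmp(value, "read-only") == 0)
		*out = SK_READ_ONLY;
	else
		return false;
	return true;
}

/* Would the first scan of the host list keep this server? */
static bool
sk_first_pass_accepts(SkType type, bool server_read_only)
{
	switch (type)
	{
		case SK_ANY:
			return true;
		case SK_READ_WRITE:
			return !server_read_only;
		case SK_PREFER_READ:	/* read-write hosts are deferred */
		case SK_READ_ONLY:		/* read-write hosts are rejected */
			return server_read_only;
	}
	return false;
}
```

The only difference between prefer-read and read-only is what happens after the first pass: prefer-read goes back to the remembered read-write host, while read-only reports failure.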
Attachment: v13-0006-Primary-prefer-standby-and-standby-options.patch (text/x-diff; charset=us-ascii)
From 9e551ce44a04a4646518c66aee146e4fc7c0c0f9 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Wed, 10 Apr 2019 23:19:09 +1000
Subject: [PATCH v13 6/8] Primary, prefer-standby and standby options
New options that check whether a server is in recovery mode
before it is considered for connection. To determine whether
a server is running in recovery mode, libpq sends it the query
'SELECT pg_is_in_recovery()'.
---
doc/src/sgml/libpq.sgml | 26 ++-
src/interfaces/libpq/fe-connect.c | 236 +++++++++++++++++++++++---
src/interfaces/libpq/libpq-fe.h | 8 +-
src/interfaces/libpq/libpq-int.h | 4 +-
src/test/recovery/t/001_stream_rep.pl | 18 +-
5 files changed, 262 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 3a1071e408..e447e8fad7 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1649,7 +1649,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are, <literal>any</literal>,
- <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>prefer-standby</literal> and <literal>standby</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts were specified in the
connection string, based on the specified value, any remaining servers
@@ -1682,6 +1683,29 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
If this parameter is set to <literal>read-only</literal>, only a connection
in which read-only transactions are accepted by default is considered acceptable.
</para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, only a connection to a
+ server that is not in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection to a
+ server that is in recovery mode is preferred. If no such connection can be found,
+ then a connection to a server that is not in recovery mode will be considered.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection to a
+ server that is in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ To find out whether the server is in recovery mode or not, the query <literal>SELECT pg_is_in_recovery()</literal>
+ is sent upon any successful connection; if it returns <literal>t</literal>, the server
+ is in recovery mode.
+ </para>
+
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index e660675808..2e1872795a 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -125,6 +125,7 @@ static int ldapServiceLookup(const char *purl, PQconninfoOption *options,
#define DefaultOption ""
#define DefaultAuthtype ""
#define DefaultTargetSessionAttrs "any"
+#define DefaultTargetServerType "any"
#ifdef USE_SSL
#define DefaultSSLMode "prefer"
#else
@@ -341,7 +342,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1303,6 +1304,12 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else if (strcmp(conn->target_session_attrs, "read-only") == 0)
conn->requested_session_type = SESSION_TYPE_READ_ONLY;
+ else if (strcmp(conn->target_session_attrs, "primary") == 0)
+ conn->requested_session_type = SESSION_TYPE_PRIMARY;
+ else if (strcmp(conn->target_session_attrs, "prefer-standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_STANDBY;
+ else if (strcmp(conn->target_session_attrs, "standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_STANDBY;
else
{
conn->status = CONNECTION_BAD;
@@ -2197,6 +2204,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -2236,19 +2244,19 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- if (conn->read_write_host_index >= 0)
+ if (conn->read_write_or_primary_host_index >= 0)
{
/*
* Getting here means we failed to connect to any read-only
* server, so now try to connect to the read-write server again.
*/
- conn->whichhost = conn->read_write_host_index;
+ conn->whichhost = conn->read_write_or_primary_host_index;
/*
* Reset the host index value to avoid recursion during the
* second connection attempt.
*/
- conn->read_write_host_index = -2;
+ conn->read_write_or_primary_host_index = -2;
}
else
{
@@ -3475,7 +3483,9 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->requested_session_type != SESSION_TYPE_ANY)
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY))
{
if (conn->sversion < 120000)
{
@@ -3518,7 +3528,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
/* Append error report to conn->errorMessage. */
@@ -3549,8 +3559,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3565,30 +3575,70 @@ keep_going: /* We will come back to here until there is
}
/*
- * Requested type is prefer-read, then record this host index
- * and try the other before considering it later. If requested
- * type of connection is read-only, ignore this connection.
+ * Servers before 9.0 don't support recovery, so skip the check
+ * when the requested type of connection is primary,
+ * prefer-standby or standby.
*/
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_session_type == SESSION_TYPE_PRIMARY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ /*
+ * Requested type is prefer-read or prefer-standby, then
+ * record this host index and try the other before considering
+ * it later. If requested type of connection is read-only or
+ * standby, ignore this connection.
+ */
+
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
- conn->requested_session_type == SESSION_TYPE_READ_ONLY)
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)
{
/*
* The following scenario is possible only for the
- * prefer-read mode for the next pass of the list of
- * connections as it couldn't find any servers that are
- * default read-only.
+ * prefer-read or prefer-standby mode for the next pass of
+ * the list of connections as it couldn't find any servers
+ * that are default read-only or in recovery mode.
*/
- if (conn->read_write_host_index == -2)
- goto target_accept_connection;
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY)
+ {
+ if (conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+ }
/*
* Try next host if any, but we don't want to consider
@@ -3722,7 +3772,7 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_write_connection;
/* Not a requested type; fail this connection. */
@@ -3757,8 +3807,8 @@ keep_going: /* We will come back to here until there is
/* Record read-write host index */
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3811,6 +3861,144 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in recovery mode and requested mode is
+ * primary, ignore it. Server is not in recovery mode and
+ * requested mode is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server is found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_recovery_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
+ consume_checked_recovery_connection:
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3956,7 +4144,7 @@ makeEmptyPGconn(void)
#endif
conn->requested_session_type = SESSION_TYPE_ANY;
- conn->read_write_host_index = -1;
+ conn->read_write_or_primary_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index fc0178cd5d..30b181aa37 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
@@ -76,7 +77,10 @@ typedef enum
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
SESSION_TYPE_PREFER_READ, /* Prefer read only session */
- SESSION_TYPE_READ_ONLY /* Read only session */
+ SESSION_TYPE_READ_ONLY, /* Read only session */
+ SESSION_TYPE_PRIMARY, /* Primary server */
+ SESSION_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SESSION_TYPE_STANDBY /* Standby server */
} TargetSessionAttrsType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index e3c15b2ba8..d9e38558c4 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -367,7 +367,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read and read-only.
+ * prefer-read, read-only, primary, prefer-standby and standby.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -411,7 +411,7 @@ struct pg_conn
* Initial value is -1, then the index of the first read-write host, -2
* during the second attempt of connection to avoid recursion.
*/
- int read_write_host_index;
+ int read_write_or_primary_host_index;
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index ac1e11e1ab..8fa28dab23 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 37;
+use Test::More tests => 41;
# Initialize master node
my $node_master = get_new_node('master');
@@ -141,6 +141,22 @@ test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"read-only", 0);
+# Connect to master in "primary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
2.17.1
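The host-selection behaviour that the patch above adds for "primary", "prefer-standby" and "standby" can be summarised in a small standalone model. This is only a sketch under stated assumptions, not the libpq code: HostInfo, Mode and choose_host() are hypothetical names, and the in_recovery field stands in for the result of "SELECT pg_is_in_recovery()" on each host. The fallback variable plays the role of read_write_or_primary_host_index.

```c
#include <stdbool.h>

/* Hypothetical stand-in for what libpq learns about each listed host. */
typedef struct
{
	const char *name;
	bool		in_recovery;	/* models "SELECT pg_is_in_recovery()" */
} HostInfo;

/* Hypothetical mirror of the three new target_session_attrs values. */
typedef enum
{
	MODE_PRIMARY,
	MODE_PREFER_STANDBY,
	MODE_STANDBY
} Mode;

/*
 * Return the index of the host that would be accepted, or -1 if no
 * host in the list is acceptable for the requested mode.
 */
static int
choose_host(const HostInfo *hosts, int nhosts, Mode mode)
{
	int			fallback = -1;	/* first primary seen, prefer-standby only */
	int			i;

	for (i = 0; i < nhosts; i++)
	{
		bool		standby = hosts[i].in_recovery;

		if (mode == MODE_PRIMARY)
		{
			if (!standby)
				return i;		/* primary requested and found */
		}
		else if (standby)
			return i;			/* standby requested (or preferred) and found */
		else if (mode == MODE_PREFER_STANDBY && fallback == -1)
			fallback = i;		/* remember it for the second pass */
	}

	/* Second pass: only prefer-standby can have recorded a fallback. */
	return fallback;
}
```

With a "master,standby1" list this model picks standby1 for both "prefer-standby" and "standby", and with a "master,master" list it falls back to the first master for "prefer-standby", which matches the cases added to 001_stream_rep.pl.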
v13-0007-New-function-to-rejecting-the-checked-write-conn.patch (text/x-diff; charset=us-ascii)
From 599bcd4c876065f06489de5820211d42ab3b6ece Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Mon, 25 Mar 2019 18:11:18 +1100
Subject: [PATCH v13 7/8] New function to rejecting the checked write
connection
After a connection has been checked for writability, if we
decide to reject it based on the result, call the newly
added function to do so.
---
src/interfaces/libpq/fe-connect.c | 123 ++++++++++++------------------
1 file changed, 47 insertions(+), 76 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 2e1872795a..f9075d2c10 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2122,6 +2122,51 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+static void
+reject_checked_write_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3518,10 +3563,6 @@ keep_going: /* We will come back to here until there is
(conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
- /* Not a requested type; fail this connection. */
- const char *displayed_host;
- const char *displayed_port;
-
/*
* The following scenario is possible only for the
* prefer-read mode for the next pass of the list of
@@ -3531,42 +3572,7 @@ keep_going: /* We will come back to here until there is
if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
@@ -3779,42 +3785,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_write_connection(conn);
goto keep_going;
}
--
2.17.1
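The error-report part of the logic that this patch moves into reject_checked_write_connection() can be tried in isolation. The following is a minimal sketch, assuming simplified stand-ins: HostEntry, HostKind, DEFAULT_PORT_STR and format_reject_message() are hypothetical names, and "5432" is assumed as the value behind DEF_PGPORT_STR. It shows the hostaddr-versus-host choice and the default-port fallback, and keeps two complete message strings as the patch does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define DEFAULT_PORT_STR "5432" /* assumed stand-in for DEF_PGPORT_STR */

/* Hypothetical stand-in for the CHT_HOST_NAME / CHT_HOST_ADDRESS kinds. */
typedef enum
{
	HOST_NAME,
	HOST_ADDRESS
} HostKind;

/* Hypothetical, simplified version of one conn->connhost[] entry. */
typedef struct
{
	HostKind	kind;
	const char *host;
	const char *hostaddr;
	const char *port;			/* may be NULL or empty */
} HostEntry;

/*
 * Format the rejection message for one host into buf and return the
 * snprintf() result.
 */
static int
format_reject_message(char *buf, size_t buflen,
					  const HostEntry *h, bool want_writable)
{
	const char *displayed_host;
	const char *displayed_port;

	/* Show the numeric address when that is what was specified. */
	if (h->kind == HOST_ADDRESS)
		displayed_host = h->hostaddr;
	else
		displayed_host = h->host;

	/* Fall back to the default port when none was given. */
	displayed_port = h->port;
	if (displayed_port == NULL || displayed_port[0] == '\0')
		displayed_port = DEFAULT_PORT_STR;

	/* Two complete strings, so each can be translated as a whole. */
	if (want_writable)
		return snprintf(buf, buflen,
						"could not make a writable connection to server \"%s:%s\"\n",
						displayed_host, displayed_port);
	return snprintf(buf, buflen,
					"could not make a readonly connection to server \"%s:%s\"\n",
					displayed_host, displayed_port);
}
```

Having this in one helper is the point of the refactoring: both call sites in PQconnectPoll then reduce to a single call followed by "goto keep_going".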
v13-0008-Server-recovery-mode-handling.patch (text/x-diff; charset=us-ascii)
From 820e747f91a0d57d5018519f59a148fb50249b57 Mon Sep 17 00:00:00 2001
From: Hari Babu <kommi.haribabu@gmail.com>
Date: Thu, 28 Mar 2019 15:30:01 +1100
Subject: [PATCH v13 8/8] Server recovery mode handling
Add an in_recovery GUC_REPORT parameter so that clients are told
whether the server is in recovery mode. This lets a client choose a
standby server with a faster check than executing a command.
Add a new SIGUSR1 interrupt to report the exit from recovery mode
to all backends and their respective clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
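The interrupt handling described above follows the usual two-step pattern: the signal handler only sets a flag, and the real work happens later at a safe point. Below is a minimal standalone sketch of that pattern, assuming a POSIX system for SIGUSR1. The names exit_pending, in_recovery_state, handle_recovery_exit() and process_recovery_exit() are hypothetical and only mirror the roles of recoveryExitInterruptPending, the in_recovery GUC, HandleRecoveryExitInterrupt() and ProcessRecoveryExitInterrupt(); the latch and the ParameterStatus message are left out.

```c
#include <signal.h>
#include <stdbool.h>

/* Set from the signal handler, read and cleared at a safe point. */
static volatile sig_atomic_t exit_pending = 0;

/* Hypothetical stand-in for the value of the in_recovery parameter. */
static bool in_recovery_state = true;

/*
 * Signal handler part: do nothing except record that work is pending.
 * Anything more complex is unsafe inside a signal handler.
 */
static void
handle_recovery_exit(int signo)
{
	(void) signo;
	exit_pending = 1;
}

/*
 * Safe-point part: called from normal code, never from the handler.
 * Returns true if a pending recovery-exit notification was processed.
 */
static bool
process_recovery_exit(void)
{
	if (!exit_pending)
		return false;

	exit_pending = 0;
	in_recovery_state = false;	/* the patch calls SetConfigOption() here */
	return true;
}
```

A caller would install the handler with signal(SIGUSR1, handle_recovery_exit) and call process_recovery_exit() whenever it is about to wait for input, which is where the patch hooks into ProcessClientReadInterrupt().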
---
doc/src/sgml/libpq.sgml | 14 ++-
doc/src/sgml/protocol.sgml | 8 +-
src/backend/access/transam/xlog.c | 3 +
src/backend/storage/ipc/procarray.c | 28 ++++++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 ++
src/backend/tcop/postgres.c | 60 ++++++++++++
src/backend/utils/init/postinit.c | 6 +-
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 16 ++++
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/include/tcop/tcopprot.h | 2 +
src/interfaces/libpq/fe-connect.c | 135 +++++++++++++++++----------
src/interfaces/libpq/fe-exec.c | 4 +
src/interfaces/libpq/libpq-int.h | 1 +
17 files changed, 235 insertions(+), 60 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index e447e8fad7..dc1e89bb2a 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1702,8 +1702,10 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To find out whether the server is in recovery mode or not, query <literal>SELECT pg_is_in_recovery()</literal>
- will be sent upon any successful connection; if it returns <literal>t</literal>, means server
- is in recovery mode.
+ will be sent upon any successful connection if the server is older than version 12; if it returns
+ <literal>t</literal>, the server is in recovery mode. For server version 12 or later, libpq instead
+ uses the value of the <varname>in_recovery</varname> configuration parameter that is reported by the
+ server upon successful connection.
</para>
</listitem>
@@ -2016,15 +2018,17 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by release before 12.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index dbf12fcc46..58142c072d 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1284,15 +1284,17 @@ SELECT 1/0;
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by releases before 12.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 12.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6876537b62..5093fd183a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7769,6 +7769,9 @@ StartupXLOG(void)
XLogCtl->SharedRecoveryInProgress = false;
SpinLockRelease(&XLogCtl->info_lck);
+ if (standbyState != STANDBY_DISABLED)
+ SendRecoveryExitSignal();
+
UpdateControlFile();
LWLockRelease(ControlFileLock);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 8abcfdf841..744475cc2c 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2970,6 +2970,34 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
return true; /* timed out, still conflicts */
}
+/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
/*
* ProcArraySetReplicationSlotXmin
*
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7605b2c367..e4548dc323 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -292,6 +292,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
SetLatch(MyLatch);
latch_sigusr1_handler();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 01ddffec40..b0e88ee545 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -138,6 +138,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index e8d8e6f828..69ce3ec786 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -167,6 +167,15 @@ static bool RecoveryConflictPending = false;
static bool RecoveryConflictRetryable = true;
static ProcSignalReason RecoveryConflictReason;
+/*
+ * Inbound recovery exit interrupts are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
+
/* reused buffer to pass to SendRowDescriptionMessage() */
static MemoryContext row_description_context = NULL;
static StringInfoData row_description_buf;
@@ -195,6 +204,7 @@ static void drop_unnamed_stmt(void);
static void log_disconnections(int code, Datum arg);
static void enable_statement_timeout(void);
static void disable_statement_timeout(void);
+static void ProcessRecoveryExitInterrupt(void);
/* ----------------------------------------------------------------
@@ -543,6 +553,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
@@ -2961,6 +2975,52 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
errno = save_errno;
}
+/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+*/
+static void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
+
+
/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 29c5ec7b58..59fb4e905b 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -649,7 +649,11 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
* This is handled by calling RecoveryInProgress and ignoring the
* result.
*/
- (void) RecoveryInProgress();
+ if (RecoveryInProgress())
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
}
else
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index 416a0875b6..a4ebcef614 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_num \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 3c47473bcd..aacaecc520 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -580,6 +580,7 @@ static char *recovery_target_string;
static char *recovery_target_xid_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
@@ -1769,6 +1770,21 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ /*
+ * Not for general use --- used to indicate whether the instance is
+ * in recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
{
{"allow_system_table_mods", PGC_POSTMASTER, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index da8b672096..86f0c13134 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -113,6 +113,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void XidCacheRemoveRunningXids(TransactionId xid,
int nxids, const TransactionId *xids,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 05b186a05c..9cf9560b06 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -42,6 +42,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index a3f8f82ff3..2c73f0c0a8 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index ec21f7e45c..ed21a9e2f2 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -66,6 +66,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
* handler */
+/* recovery exit interrupt handling function */
+extern void HandleRecoveryExitInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f9075d2c10..585339b537 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2167,6 +2167,49 @@ reject_checked_write_connection(PGconn *conn)
conn->try_next_host = true;
}
+static void
+reject_checked_recovery_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3590,27 +3633,52 @@ keep_going: /* We will come back to here until there is
conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
conn->requested_session_type == SESSION_TYPE_STANDBY)))
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ if (conn->sversion < 120000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->in_recovery &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
+
+ reject_checked_recovery_connection(conn);
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_RECOVERY;
-
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
}
/*
@@ -3891,40 +3959,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is not in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record primary host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_recovery_connection(conn);
goto keep_going;
}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 3c17100e05..a964483eee 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1117,6 +1117,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
{
conn->transaction_read_only = (strcmp(value, "on") == 0);
}
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index d9e38558c4..436109b41c 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -444,6 +444,7 @@ struct pg_conn
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
bool transaction_read_only; /* transaction_read_only */
+ bool in_recovery; /* in_recovery */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
2.17.1
On Wed, Sep 11, 2019 at 10:17 AM Alvaro Herrera from 2ndQuadrant
<alvherre@alvh.no-ip.org> wrote:
> Oh, oops. Here they are then.
With the permission of the original patch author, Haribabu Kommi, I’ve
rationalized the existing 8 patches into 3 patches, merging patches
1-5 and 6-7, and tidying up some documentation and code comments. I
also rebased them to the latest PG12 source code (as of October 1,
2019). The patch code itself is the same, except for some version
checks that I have updated to target the features for PG13 instead of
PG12.
I’ve attached the updated patches.
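For anyone skimming the patches, here is a rough standalone model of the host selection that 0001 implements for the new "prefer-read" and "read-only" values. It is only a sketch under stated assumptions, not code from the patch: choose_host(), host_matches() and the SessionType names are made up for illustration, read_only[i] stands in for whatever transaction_read_only reports for host i, and unreachable hosts, error messages and server-version checks are ignored entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the patch's TargetSessionAttrsType values. */
typedef enum
{
    TYPE_ANY,
    TYPE_READ_WRITE,
    TYPE_PREFER_READ,
    TYPE_READ_ONLY
} SessionType;

/* Does a host with the given read-only state satisfy the request outright? */
static bool
host_matches(SessionType requested, bool host_is_read_only)
{
    switch (requested)
    {
        case TYPE_ANY:
            return true;
        case TYPE_READ_WRITE:
            return !host_is_read_only;
        case TYPE_PREFER_READ:
        case TYPE_READ_ONLY:
            return host_is_read_only;
    }
    return false;
}

/*
 * Return the index of the host that would be used, or -1 if none qualifies.
 *
 * Hosts are tried in the order given.  For prefer-read, the first
 * read-write host seen is remembered and used as a fallback once the
 * whole list has been scanned without finding a read-only host.
 */
static int
choose_host(SessionType requested, const bool *read_only, size_t nhosts)
{
    int first_read_write = -1;

    for (size_t i = 0; i < nhosts; i++)
    {
        if (host_matches(requested, read_only[i]))
            return (int) i;

        /* Under prefer-read, a rejected host is by definition read-write. */
        if (requested == TYPE_PREFER_READ && first_read_write == -1)
            first_read_write = (int) i;
    }

    /* Stays -1 unless prefer-read recorded a fallback host. */
    return first_read_write;
}
```

With a "master,standby1" host list this picks standby1 for both prefer-read and read-only, and with only a master in the list prefer-read falls back to the master while read-only fails, which is the behaviour the new cases in 001_stream_rep.pl are meant to exercise.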
Regards,
Greg Nancarrow
Fujitsu Australia
Attachments:
v14-0001-libpq-target_session_attrs-read_write-prefer_read-read_only.patch (application/octet-stream)
From 74cad19872e70ffe937ad582fab30bb8f17f7a16 Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Mon, 30 Sep 2019 11:55:57 +1000
Subject: [PATCH v14 1/3] Enhance libpq target_session_attrs:
read-write/prefer-read/read-only
Improve checking of the requested target session type, to avoid always having to do string
comparisons on target_session_attrs.
Make "transaction_read_only" a GUC_REPORT variable, to avoid having to execute a query
post-connection in order to determine whether a host is read-write (and reduce time to make the
connection).
Add new "prefer-read" target_session_attrs option value, to support connecting to a read-only
server if available from the list of hosts (otherwise connect to a read-write server).
Add new "read-only" target_session_attrs option value, to support connecting to a read-only
server if available from the list of hosts (otherwise the connection attempt fails).
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
doc/src/sgml/libpq.sgml | 54 ++++++--
doc/src/sgml/protocol.sgml | 8 +-
src/backend/utils/misc/guc.c | 2 +-
src/interfaces/libpq/fe-connect.c | 239 +++++++++++++++++++++++++++++-----
src/interfaces/libpq/fe-exec.c | 6 +-
src/interfaces/libpq/libpq-fe.h | 8 ++
src/interfaces/libpq/libpq-int.h | 15 ++-
src/test/recovery/t/001_stream_rep.pl | 22 +++-
8 files changed, 299 insertions(+), 55 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c58527b..0d3edfc 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1674,18 +1674,46 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts are specified in the
+ connection string, each host is tried in the order given until a connection
+ is successful.
+ </para>
+
+ <para>
If this parameter is set to <literal>read-write</literal>, only a
connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, a
+ connection in which read-only transactions are accepted by default
+ is preferred. If no such connections can be found, then a connection
+ in which read-write transactions are accepted will be considered.
+ </para>
+
+ <para>
+ To determine whether the server supports read-write transactions, the
+ query <literal>SHOW transaction_read_only</literal> will be sent upon any
+ successful connection, if the server is prior to version 13; if it returns
+ <literal>on</literal>, it means the server doesn't support read-write
+ transactions.
+ If the server is version 13 or greater, the support of read-write
+ transactions is determined by the value of the
+ <varname>transaction_read_only</varname> configuration parameter that is
+ reported by the server upon successful connection.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection
+ in which read-only transactions are accepted by default is considered
+ acceptable.
+ </para>
</listitem>
- </varlistentry>
+ </varlistentry>
</variablelist>
</para>
</sect2>
@@ -1993,14 +2021,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 13.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 8027521..87b95bc 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELECT 1/0;
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 13.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 2178e1c..c64ec03 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1543,7 +1543,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2..7058f5d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -350,7 +350,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1327,8 +1327,15 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ else if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_READ;
+ else if (strcmp(conn->target_session_attrs, "read-only") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_ONLY;
+ else
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -2261,13 +2268,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to read-only servers
+ * and should now try to connect to a read-write server again.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3474,38 +3499,139 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write, prefer-read or read-only connection is
+ * required, see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type != SESSION_TYPE_ANY)
+ {
+ if (conn->sversion < 130000)
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_WRITABLE;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
+ {
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_target_connection;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * If the requested type is prefer-read, record this host index
+ * and try the other hosts before considering it later. If the
+ * requested type is read-only, ignore this connection.
+ */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)
{
/*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that are
+ * default read-only.
*/
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
+ if (conn->read_write_host_index == -2)
+ goto target_accept_connection;
+ /* Close connection politely. */
conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
- {
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
- }
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3604,11 +3730,35 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested mode is read-write,
+ * ignore it. Server is read-write and requested mode is
+ * prefer-read, record it for the first time and try to
+ * consume in the next scan (it means no read-only server
+ * is found in the first scan). Server is read-write and
+ * requested mode is read-only, ignore this connection.
+ */
+ if ((readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!readonly_server &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
- /* Not writable; fail this connection. */
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_write_connection;
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3621,16 +3771,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3639,7 +3801,8 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3817,6 +3980,7 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
@@ -3824,6 +3988,9 @@ makeEmptyPGconn(void)
conn->try_gss = true;
#endif
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ conn->read_write_host_index = -1;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b3c59a0..3c17100 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1059,7 +1059,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1113,6 +1113,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 5f65db3..aa6f22f 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -73,6 +73,14 @@ typedef enum
typedef enum
{
+ SESSION_TYPE_ANY = 0, /* Any session (default) */
+ SESSION_TYPE_READ_WRITE, /* Read-write session */
+ SESSION_TYPE_PREFER_READ, /* Prefer read only session */
+ SESSION_TYPE_READ_ONLY /* Read only session */
+} TargetSessionAttrsType;
+
+typedef enum
+{
PGRES_POLLING_FAILED = 0,
PGRES_POLLING_READING, /* These two indicate that one may */
PGRES_POLLING_WRITING, /* use select before polling again. */
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 64468ab..eb687d8 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -367,8 +367,12 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: any, read-write,
+ * prefer-read and read-only.
+ */
char *target_session_attrs;
+ TargetSessionAttrsType requested_session_type;
/* Optional file to write trace info to */
FILE *Pfdebug;
@@ -403,6 +407,14 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * First read-write host index in the connection string.
+ *
+ * Initial value is -1, then the index of the first read-write host, -2
+ * during the second attempt of connection to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
@@ -433,6 +445,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 3c743d7..ac1e11e 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 32;
+use Test::More tests => 37;
# Initialize master node
my $node_master = get_new_node('master');
@@ -121,6 +121,26 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
+# Connect to standby1 in "read-only" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "read-only", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
1.8.3.1
v14-0003-Server-recovery-mode-handling.patch (application/octet-stream)
From 6ce04730b7fcc6fa532cbae3a2c149e3da6ae03c Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Mon, 30 Sep 2019 15:45:46 +1000
Subject: [PATCH v14 3/3] Server recovery mode handling
Add "in_recovery" as a GUC_REPORT variable, to update clients when the
server is in recovery mode. This improves the speed of client connections
to a standby server, by avoiding the need to execute a command to
determine if the server is in recovery mode.
Add new SIGUSR1 handling interrupt to support reporting of recovery mode
exit to all backends and their respective clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
doc/src/sgml/libpq.sgml | 16 ++--
doc/src/sgml/protocol.sgml | 8 +-
src/backend/access/transam/xlog.c | 3 +
src/backend/storage/ipc/procarray.c | 28 +++++++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 +++
src/backend/tcop/postgres.c | 60 +++++++++++++++
src/backend/utils/init/postinit.c | 6 +-
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 16 ++++
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/include/tcop/tcopprot.h | 2 +
src/interfaces/libpq/fe-connect.c | 142 +++++++++++++++++++++++------------
src/interfaces/libpq/fe-exec.c | 4 +
src/interfaces/libpq/libpq-int.h | 1 +
17 files changed, 245 insertions(+), 59 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 5f31fd0..9f66fb8 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1732,8 +1732,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To determine whether the server is in recovery mode, the query
- <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful connection;
- if it returns <literal>t</literal>, it means the server is in recovery mode.
+ <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful connection
+ if the server is prior to version 13; if it returns <literal>t</literal>, it means the server
+ is in recovery mode.
+ If the server is version 13 or greater, the recovery mode state is determined by the value of
+ the <varname>in_recovery</varname> configuration parameter that is reported by the server upon
+ successful connection.
</para>
</listitem>
@@ -2046,15 +2050,17 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by releases before 13.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 13.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 87b95bc..df3953f 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1284,15 +1284,17 @@ SELECT 1/0;
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by releases before 13.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 13.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6c69eb6..4fe506e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7770,6 +7770,9 @@ StartupXLOG(void)
XLogCtl->SharedRecoveryInProgress = false;
SpinLockRelease(&XLogCtl->info_lck);
+ if (standbyState != STANDBY_DISABLED)
+ SendRecoveryExitSignal();
+
UpdateControlFile();
LWLockRelease(ControlFileLock);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 8abcfdf..744475c 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2971,6 +2971,34 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
}
/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
* ProcArraySetReplicationSlotXmin
*
* Install limits to future computations of the xmin horizon to prevent vacuum
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7605b2c..e4548dc 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -292,6 +292,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
SetLatch(MyLatch);
latch_sigusr1_handler();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 01ddffe..b0e88ee 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -138,6 +138,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index e8d8e6f..69ce3ec 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -167,6 +167,15 @@ static bool RecoveryConflictPending = false;
static bool RecoveryConflictRetryable = true;
static ProcSignalReason RecoveryConflictReason;
+/*
+ * Inbound recovery exit interrupts are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
+
/* reused buffer to pass to SendRowDescriptionMessage() */
static MemoryContext row_description_context = NULL;
static StringInfoData row_description_buf;
@@ -195,6 +204,7 @@ static void drop_unnamed_stmt(void);
static void log_disconnections(int code, Datum arg);
static void enable_statement_timeout(void);
static void disable_statement_timeout(void);
+static void ProcessRecoveryExitInterrupt(void);
/* ----------------------------------------------------------------
@@ -543,6 +553,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
@@ -2962,6 +2976,52 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
}
/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited the recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+ */
+static void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
+
+
+/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
* If an interrupt condition is pending, and it's safe to service it,
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 29c5ec7..59fb4e9 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -649,7 +649,11 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
* This is handled by calling RecoveryInProgress and ignoring the
* result.
*/
- (void) RecoveryInProgress();
+ if (RecoveryInProgress())
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
}
else
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index 416a087..a4ebcef 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_num \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c64ec03..633bc2d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -580,6 +580,7 @@ static char *recovery_target_string;
static char *recovery_target_xid_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
@@ -1770,6 +1771,21 @@ static struct config_bool ConfigureNamesBool[] =
},
{
+ /*
+ * Not for general use --- used to indicate whether the instance is
+ * in recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
+ {
{"allow_system_table_mods", PGC_POSTMASTER, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
NULL,
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index da8b672..86f0c13 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -113,6 +113,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void XidCacheRemoveRunningXids(TransactionId xid,
int nxids, const TransactionId *xids,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 05b186a..9cf9560 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -42,6 +42,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index a3f8f82..2c73f0c 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index ec21f7e..ed21a9e 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -66,6 +66,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
* handler */
+/* recovery exit interrupt handling function */
+extern void HandleRecoveryExitInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 0c096d1..7a04618 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2207,6 +2207,58 @@ reject_checked_read_or_write_connection(PGconn *conn)
conn->try_next_host = true;
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested session type (for recovery). The connection state
+ * is set to try the next host (if any).
+ * In the case of SESSION_TYPE_PREFER_STANDBY, if the read-write-or-primary host-index
+ * hasn't been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+reject_checked_recovery_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3630,27 +3682,52 @@ keep_going: /* We will come back to here until there is
conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
conn->requested_session_type == SESSION_TYPE_STANDBY)))
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ if (conn->sversion < 130000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
}
+ else if ((conn->in_recovery &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
- conn->status = CONNECTION_CHECK_RECOVERY;
+ reject_checked_recovery_connection(conn);
+ goto keep_going;
+ }
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
}
/*
@@ -3932,40 +4009,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is not in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record primary host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_recovery_connection(conn);
goto keep_going;
}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 3c17100..a964483 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1117,6 +1117,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
{
conn->transaction_read_only = (strcmp(value, "on") == 0);
}
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 0efe16f..327dd04 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -446,6 +446,7 @@ struct pg_conn
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
bool transaction_read_only; /* transaction_read_only */
+ bool in_recovery; /* in_recovery */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
1.8.3.1
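
Reviewer note (not part of either patch): the deferred-interrupt pattern that
HandleRecoveryExitInterrupt() and ProcessRecoveryExitInterrupt() follow in the
patch above can be illustrated outside the backend with a small standalone
sketch. All sketch_* names are hypothetical. The sketch uses SIGINT and
raise() so that it stays within ISO C, whereas the patch is driven by SIGUSR1
through the procsignal machinery, and a plain boolean stands in for the
SetConfigOption("in_recovery", "off", ...) call.

```c
#include <signal.h>
#include <stdbool.h>

/* Set inside the signal handler, cleared by normal code. */
static volatile sig_atomic_t sketch_exit_pending = 0;

/* Stands in for the backend's in_recovery setting. */
static bool sketch_in_recovery = true;

/* Signal handler portion: only record that work is pending. */
static void
sketch_handle_signal(int signo)
{
    (void) signo;
    sketch_exit_pending = 1;
}

/* Called from normal code when it is safe to act on the flag. */
static bool
sketch_process_pending(void)
{
    if (!sketch_exit_pending)
        return false;
    sketch_exit_pending = 0;
    sketch_in_recovery = false; /* placeholder for the real state change */
    return true;
}

/* Installs the handler, delivers the signal to this process, processes it. */
static bool
sketch_demo(void)
{
    if (signal(SIGINT, sketch_handle_signal) == SIG_ERR)
        return false;
    if (raise(SIGINT) != 0)
        return false;
    return sketch_process_pending() && !sketch_in_recovery;
}
```

The point of the split is that the handler touches nothing but a
sig_atomic_t flag, and the state change happens later from ordinary code.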
Attachment: v14-0002-libpq-target_session_attrs-primary-prefer_standby-standby.patch (application/octet-stream)
From cb2b1b9d07cce3cb13d9ed2276b683466d6a73fa Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Mon, 30 Sep 2019 13:23:57 +1000
Subject: [PATCH v14 2/3] Enhance libpq target_session_attrs:
primary/prefer-standby/standby
Add new "primary" target_session_attrs option value, to support connecting to a server which is not
in recovery mode, if available from the list of hosts (otherwise the connection attempt fails).
Add new "prefer-standby" target_session_attrs option value, to support connecting to a server which
is in recovery mode, if available from the list of hosts (otherwise connect to a server which is
not in recovery mode).
Add new "standby" target_session_attrs option value, to support connecting to a server which is in
recovery mode, if available from the list of hosts (otherwise the connection attempt fails).
To determine whether a server is in recovery mode, libpq sends it the query 'SELECT pg_is_in_recovery()'.
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
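Reviewer note (not part of the patch): the "prefer-standby" behaviour
described in the commit message above amounts to a scan of the host list with
a remembered fallback. The following is a simplified sketch, assuming every
host in the list is reachable and its recovery state is already known. The
function name and the array representation are hypothetical; in the patch the
fallback is tracked in conn->read_write_or_primary_host_index.

```c
#include <stdbool.h>

/*
 * Simulates a prefer-standby scan. in_recovery[i] says whether host i is a
 * standby. Returns the index of the chosen host, or -1 for an empty list.
 */
static int
sketch_prefer_standby_pick(const bool *in_recovery, int nhosts)
{
    int fallback = -1;      /* no primary remembered yet */

    for (int i = 0; i < nhosts; i++)
    {
        if (in_recovery[i])
            return i;       /* first standby in list order wins */
        if (fallback == -1)
            fallback = i;   /* remember the first primary seen */
    }

    /* No standby found: go back to the remembered primary, if any. */
    return fallback;
}
```

With "standby" instead of "prefer-standby" there is no fallback, so the same
scan would fail when no host is in recovery; with "primary" the test is
inverted and there is likewise no fallback.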
doc/src/sgml/libpq.sgml | 26 ++-
src/interfaces/libpq/fe-connect.c | 313 ++++++++++++++++++++++++++--------
src/interfaces/libpq/libpq-fe.h | 10 +-
src/interfaces/libpq/libpq-int.h | 4 +-
src/test/recovery/t/001_stream_rep.pl | 18 +-
5 files changed, 291 insertions(+), 80 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 0d3edfc..5f31fd0 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1675,7 +1675,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are <literal>any</literal>,
- <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>prefer-standby</literal> and <literal>standby</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts are specified in the
connection string, each host is tried in the order given until a connection
@@ -1712,6 +1713,29 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
in which read-only transactions are accepted by default is considered
acceptable.
</para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, only a connection in which
+ the server is not in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection in which
+ the server is in recovery mode is preferred. If no such connections can be found,
+ then a connection in which the server is not in recovery mode will be considered.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ To determine whether the server is in recovery mode, the query
+ <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful connection;
+ if it returns <literal>t</literal>, it means the server is in recovery mode.
+ </para>
+
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 7058f5d..0c096d1 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -350,7 +350,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1335,6 +1335,12 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else if (strcmp(conn->target_session_attrs, "read-only") == 0)
conn->requested_session_type = SESSION_TYPE_READ_ONLY;
+ else if (strcmp(conn->target_session_attrs, "primary") == 0)
+ conn->requested_session_type = SESSION_TYPE_PRIMARY;
+ else if (strcmp(conn->target_session_attrs, "prefer-standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_STANDBY;
+ else if (strcmp(conn->target_session_attrs, "standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_STANDBY;
else
{
conn->status = CONNECTION_BAD;
@@ -2147,6 +2153,60 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested session type. The connection state is set to
+ * try the next host (if any).
+ * In the case of SESSION_TYPE_PREFER_READ, if the read-write host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a read-only host could be made after the first host scan.
+ */
+static void
+reject_checked_read_or_write_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -2229,6 +2289,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -2268,19 +2329,19 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- if (conn->read_write_host_index >= 0)
+ if (conn->read_write_or_primary_host_index >= 0)
{
/*
* Getting here means we failed to connect to read-only servers
* and should now try to connect to a read-write server again.
*/
- conn->whichhost = conn->read_write_host_index;
+ conn->whichhost = conn->read_write_or_primary_host_index;
/*
* Reset the host index value to avoid recursion during the
* second connection attempt.
*/
- conn->read_write_host_index = -2;
+ conn->read_write_or_primary_host_index = -2;
}
else
{
@@ -3507,7 +3568,9 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->requested_session_type != SESSION_TYPE_ANY)
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY))
{
if (conn->sversion < 130000)
{
@@ -3540,55 +3603,16 @@ keep_going: /* We will come back to here until there is
(conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
- /* Not a requested type; fail this connection. */
- const char *displayed_host;
- const char *displayed_port;
-
/*
* The following scenario is possible only for the
* prefer-read mode for the next pass of the list of
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_read_or_write_connection(conn);
goto keep_going;
}
@@ -3597,30 +3621,70 @@ keep_going: /* We will come back to here until there is
}
/*
- * Requested type is prefer-read, then record this host index
- * and try the other before considering it later. If requested
- * type of connection is read-only, ignore this connection.
+ * Servers before 9.0 don't support recovery, skip the check
+ * when the requested type of connection is primary,
+ * prefer-standby or standby.
+ */
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_session_type == SESSION_TYPE_PRIMARY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ /*
+ * Requested type is prefer-read or prefer-standby, then
+ * record this host index and try the other before considering
+ * it later. If requested type of connection is read-only or
+ * standby, ignore this connection.
*/
+
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
- conn->requested_session_type == SESSION_TYPE_READ_ONLY)
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)
{
/*
* The following scenario is possible only for the
- * prefer-read mode for the next pass of the list of
- * connections as it couldn't find any servers that are
- * default read-only.
+ * prefer-read or prefer-standby mode for the next pass of
+ * the list of connections as it couldn't find any servers
+ * that are default read-only or in recovery mode.
*/
- if (conn->read_write_host_index == -2)
- goto target_accept_connection;
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY)
+ {
+ if (conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+ }
/*
* Try next host if any, but we don't want to consider
@@ -3755,13 +3819,119 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_write_connection;
/* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
+ reject_checked_read_or_write_connection(conn);
+ goto keep_going;
+ }
+
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SHOW transaction_read_only". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
+
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in recovery mode and requested mode is
+ * primary, ignore it. Server is not in recovery mode and
+ * requested mode is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server is found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_recovery_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
/* Append error report to conn->errorMessage. */
if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
displayed_host = conn->connhost[conn->whichhost].hostaddr;
@@ -3771,16 +3941,14 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
+ libpq_gettext("server is in recovery mode "
"\"%s:%s\"\n"),
displayed_host, displayed_port);
else
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
+ libpq_gettext("server is not in recovery mode "
"\"%s:%s\"\n"),
displayed_host, displayed_port);
@@ -3788,10 +3956,10 @@ keep_going: /* We will come back to here until there is
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3801,7 +3969,7 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- consume_checked_write_connection:
+ consume_checked_recovery_connection:
/* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3815,7 +3983,7 @@ keep_going: /* We will come back to here until there is
}
/*
- * Something went wrong with "SHOW transaction_read_only". We
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
* should try next addresses.
*/
if (res)
@@ -3831,7 +3999,7 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
"on server \"%s:%s\"\n"),
displayed_host, displayed_port);
@@ -3843,7 +4011,6 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
-
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3989,7 +4156,7 @@ makeEmptyPGconn(void)
#endif
conn->requested_session_type = SESSION_TYPE_ANY;
- conn->read_write_host_index = -1;
+ conn->read_write_or_primary_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index aa6f22f..30b181a 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
@@ -76,8 +77,11 @@ typedef enum
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
SESSION_TYPE_PREFER_READ, /* Prefer read only session */
- SESSION_TYPE_READ_ONLY /* Read only session */
-} TargetSessionAttrsType;
+ SESSION_TYPE_READ_ONLY, /* Read only session */
+ SESSION_TYPE_PRIMARY, /* Primary server */
+ SESSION_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SESSION_TYPE_STANDBY /* Standby server */
+} TargetSessionAttrsType;
typedef enum
{
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index eb687d8..0efe16f 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -369,7 +369,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read and read-only.
+ * prefer-read, read-only, primary, prefer-standby and standby.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -413,7 +413,7 @@ struct pg_conn
* Initial value is -1, then the index of the first read-write host, -2
* during the second attempt of connection to avoid recursion.
*/
- int read_write_host_index;
+ int read_write_or_primary_host_index;
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index ac1e11e..8fa28da 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 37;
+use Test::More tests => 41;
# Initialize master node
my $node_master = get_new_node('master');
@@ -141,6 +141,22 @@ test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"read-only", 0);
+# Connect to master in "primary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
1.8.3.1
On Wed, Sep 11, 2019 at 10:17 AM Alvaro Herrera from 2ndQuadrant
<alvherre@alvh.no-ip.org> wrote:
Oh, oops. Here they are then.
With the permission of the original patch author, Haribabu Kommi, I’ve
consolidated the existing 8 patches into 3, merging patches 1-5 and 6-7
and tidying up some documentation and code comments. I also rebased them
onto the latest PG12 source code (as of October 1, 2019). The patch code
itself is unchanged, except for some version checks that I have updated
so that the features target PG13 instead of PG12.
I’ve attached the updated patches.
Regards,
Greg Nancarrow
Fujitsu Australia
Attachments:
v14-0001-libpq-target_session_attrs-read_write-prefer_read-read_only.patch (application/octet-stream)
From 74cad19872e70ffe937ad582fab30bb8f17f7a16 Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Mon, 30 Sep 2019 11:55:57 +1000
Subject: [PATCH v14 1/3] Enhance libpq target_session_attrs:
read-write/prefer-read/read-only
Improve checking of the requested target session type, to avoid always having to do string
comparisons on target_session_attrs.
Make "transaction_read_only" a GUC_REPORT variable, to avoid having to execute a query
post-connection in order to determine whether a host is read-write (and reduce time to make the
connection).
Add new "prefer-read" target_session_attrs option value, to support connecting to a read-only
server if available from the list of hosts (otherwise connect to a read-write server).
Add new "read-only" target_session_attrs option value, to support connecting to a read-only
server if available from the list of hosts (otherwise the connection attempt fails).
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
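Reviewer's sketch (not part of the patch): the host-selection behaviour described in the
commit message can be modelled without libpq as a scan over the host list that remembers the
first read-write host as a fallback for "prefer-read". All names below (SketchSessionType,
sketch_choose_host) are hypothetical and exist only for illustration. The sketch assumes every
host is reachable and collapses the patch's second connection pass (the -2 sentinel in
read_write_host_index) into simply returning the remembered index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the patch's TargetSessionAttrsType values. */
typedef enum
{
    SKETCH_ANY,
    SKETCH_READ_WRITE,
    SKETCH_PREFER_READ,
    SKETCH_READ_ONLY
} SketchSessionType;

/*
 * Return the index of the host that would be accepted, or -1 if none.
 * read_only[i] says whether host i reports transaction_read_only = on.
 */
static int
sketch_choose_host(const bool *read_only, int nhosts, SketchSessionType want)
{
    int rw_index = -1;      /* plays the role of read_write_host_index */
    int i;

    assert(nhosts >= 0 && (read_only != NULL || nhosts == 0));

    for (i = 0; i < nhosts; i++)
    {
        /* Same mismatch test as the patch applies after connecting. */
        bool mismatch =
            (read_only[i] && want == SKETCH_READ_WRITE) ||
            (!read_only[i] &&
             (want == SKETCH_PREFER_READ || want == SKETCH_READ_ONLY));

        if (!mismatch)
            return i;

        /* prefer-read only: remember the first read-write host. */
        if (want == SKETCH_PREFER_READ && rw_index == -1)
            rw_index = i;
    }

    /* No match in the first pass; fall back if a host was recorded. */
    return rw_index;
}
```

With a master,standby1 list this model picks standby1 for "prefer-read" and "read-only", and
with a master-only list it falls back to master for "prefer-read" but fails for "read-only",
which matches the expectations of the added TAP tests.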
doc/src/sgml/libpq.sgml | 54 ++++++--
doc/src/sgml/protocol.sgml | 8 +-
src/backend/utils/misc/guc.c | 2 +-
src/interfaces/libpq/fe-connect.c | 239 +++++++++++++++++++++++++++++-----
src/interfaces/libpq/fe-exec.c | 6 +-
src/interfaces/libpq/libpq-fe.h | 8 ++
src/interfaces/libpq/libpq-int.h | 15 ++-
src/test/recovery/t/001_stream_rep.pl | 22 +++-
8 files changed, 299 insertions(+), 55 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index c58527b..0d3edfc 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1674,18 +1674,46 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts are specified in the
+ connection string, each host is tried in the order given until a connection
+ is successful.
+ </para>
+
+ <para>
If this parameter is set to <literal>read-write</literal>, only a
connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-read</literal>, a
+ connection in which read-only transactions are accepted by default
+ is preferred. If no such connections can be found, then a connection
+ in which read-write transactions are accepted will be considered.
+ </para>
+
+ <para>
+ To determine whether the server supports read-write transactions, the
+ query <literal>SHOW transaction_read_only</literal> will be sent upon any
+ successful connection, if the server is prior to version 13; if it returns
+ <literal>on</literal>, it means the server doesn't support read-write
+ transactions.
+ If the server is version 13 or greater, the support of read-write
+ transactions is determined by the value of the
+ <varname>transaction_read_only</varname> configuration parameter that is
+ reported by the server upon successful connection.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection
+ in which read-only transactions are accepted by default is considered
+ acceptable.
+ </para>
</listitem>
- </varlistentry>
+ </varlistentry>
</variablelist>
</para>
</sect2>
@@ -1993,14 +2021,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 13.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 8027521..87b95bc 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELECT 1/0;
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>, and
+ <varname>transaction_read_only</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> was not reported by releases before 13.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 2178e1c..c64ec03 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1543,7 +1543,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2..7058f5d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -350,7 +350,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1327,8 +1327,15 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ else if (strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_WRITE;
+ else if (strcmp(conn->target_session_attrs, "prefer-read") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_READ;
+ else if (strcmp(conn->target_session_attrs, "read-only") == 0)
+ conn->requested_session_type = SESSION_TYPE_READ_ONLY;
+ else
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -2261,13 +2268,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->read_write_host_index >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to read-only servers
+ * and should now try to connect to a read-write server again.
+ */
+ conn->whichhost = conn->read_write_host_index;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->read_write_host_index = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3474,38 +3499,139 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a read-write, prefer-read or read-only connection is
+ * required, see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type != SESSION_TYPE_ANY)
+ {
+ if (conn->sversion < 130000)
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn,
+ "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_WRITABLE;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->transaction_read_only &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
+ {
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_target_connection;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later. If requested
+ * type of connection is read-only, ignore this connection.
+ */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)
{
/*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that are
+ * default read-only.
*/
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
+ if (conn->read_write_host_index == -2)
+ goto target_accept_connection;
+ /* Close connection politely. */
conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
- {
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
- }
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3604,11 +3730,35 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested mode is read-write,
+ * ignore it. Server is read-write and requested mode is
+ * prefer-read, record it for the first time and try to
+ * consume in the next scan (it means no read-only server
+ * is found in the first scan). Server is read-write and
+ * requested mode is read-only, ignore this connection.
+ */
+ if ((readonly_server &&
+ conn->requested_session_type == SESSION_TYPE_READ_WRITE) ||
+ (!readonly_server &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
- /* Not writable; fail this connection. */
+ /*
+ * The following scenario is possible only for the
+ * prefer-read mode for the next pass of the list of
+ * connections as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->read_write_host_index == -2)
+ goto consume_checked_write_connection;
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
@@ -3621,16 +3771,28 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_host_index == -1)
+ conn->read_write_host_index = conn->whichhost;
+
/*
* Try next host if any, but we don't want to consider
* additional addresses for this host.
@@ -3639,7 +3801,8 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3817,6 +3980,7 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
@@ -3824,6 +3988,9 @@ makeEmptyPGconn(void)
conn->try_gss = true;
#endif
+ conn->requested_session_type = SESSION_TYPE_ANY;
+ conn->read_write_host_index = -1;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index b3c59a0..3c17100 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1059,7 +1059,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1113,6 +1113,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 5f65db3..aa6f22f 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -73,6 +73,14 @@ typedef enum
typedef enum
{
+ SESSION_TYPE_ANY = 0, /* Any session (default) */
+ SESSION_TYPE_READ_WRITE, /* Read-write session */
+ SESSION_TYPE_PREFER_READ, /* Prefer read only session */
+ SESSION_TYPE_READ_ONLY /* Read only session */
+} TargetSessionAttrsType;
+
+typedef enum
+{
PGRES_POLLING_FAILED = 0,
PGRES_POLLING_READING, /* These two indicate that one may */
PGRES_POLLING_WRITING, /* use select before polling again. */
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 64468ab..eb687d8 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -367,8 +367,12 @@ struct pg_conn
char *krbsrvname; /* Kerberos service name */
#endif
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: any, read-write,
+ * prefer-read and read-only.
+ */
char *target_session_attrs;
+ TargetSessionAttrsType requested_session_type;
/* Optional file to write trace info to */
FILE *Pfdebug;
@@ -403,6 +407,14 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * First read-write host index in the connection string.
+ *
+ * Initial value is -1, then the index of the first read-write host, -2
+ * during the second attempt of connection to avoid recursion.
+ */
+ int read_write_host_index;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
@@ -433,6 +445,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 3c743d7..ac1e11e 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 32;
+use Test::More tests => 37;
# Initialize master node
my $node_master = get_new_node('master');
@@ -121,6 +121,26 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to standby1 in "prefer-read" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1, "prefer-read",
+ 0);
+
+# Connect to standby1 in "prefer-read" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-read", 0);
+
+# Connect to node_master in "prefer-read" mode with only master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-read", 0);
+
+# Connect to standby1 in "read-only" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "read-only", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
1.8.3.1
v14-0002-libpq-target_session_attrs-primary-prefer_standby-standby.patch (application/octet-stream)
From cb2b1b9d07cce3cb13d9ed2276b683466d6a73fa Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Mon, 30 Sep 2019 13:23:57 +1000
Subject: [PATCH v14 2/3] Enhance libpq target_session_attrs:
primary/prefer-standby/standby
Add new "primary" target_session_attrs option value, to support connecting to a server which is not
in recovery mode, if available from the list of hosts (otherwise the connection attempt fails).
Add new "prefer-standby" target_session_attrs option value, to support connecting to a server which
is in recovery mode, if available from the list of hosts (otherwise connect to a server which is
not in recovery mode).
Add new "standby" target_session_attrs option value, to support connecting to a server which is in
recovery mode, if available from the list of hosts (otherwise the connection attempt fails).
To determine if running in recovery mode, the server is sent the query 'SELECT pg_is_in_recovery()'.
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
doc/src/sgml/libpq.sgml | 26 ++-
src/interfaces/libpq/fe-connect.c | 313 ++++++++++++++++++++++++++--------
src/interfaces/libpq/libpq-fe.h | 10 +-
src/interfaces/libpq/libpq-int.h | 4 +-
src/test/recovery/t/001_stream_rep.pl | 18 +-
5 files changed, 291 insertions(+), 80 deletions(-)
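
Note on the commit message above: the accept/reject rule it describes for the three new option values can be sketched as a small standalone C program. This is only an illustration and is not part of the patch. The names TargetKind, parse_target, is_in_recovery and check_host are hypothetical; the patch itself uses the SESSION_TYPE_* values and does this work inside PQconnectPoll. The second_pass flag stands in for the patch's read_write_or_primary_host_index == -2 state.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical names for illustration; the patch uses SESSION_TYPE_*. */
typedef enum
{
	TARGET_PRIMARY,
	TARGET_PREFER_STANDBY,
	TARGET_STANDBY,
	TARGET_UNKNOWN
} TargetKind;

typedef enum
{
	VERDICT_ACCEPT,
	VERDICT_REJECT
} Verdict;

/* Map an option string to a target kind; anything else is unknown. */
TargetKind
parse_target(const char *value)
{
	if (strcmp(value, "primary") == 0)
		return TARGET_PRIMARY;
	if (strcmp(value, "prefer-standby") == 0)
		return TARGET_PREFER_STANDBY;
	if (strcmp(value, "standby") == 0)
		return TARGET_STANDBY;
	return TARGET_UNKNOWN;
}

/* Interpret the single text value returned by SELECT pg_is_in_recovery(). */
bool
is_in_recovery(const char *val)
{
	return strncmp(val, "t", 1) == 0;
}

/*
 * Decide whether a checked host is acceptable.  second_pass is true only
 * when prefer-standby is revisiting the primary it remembered earlier.
 */
Verdict
check_host(TargetKind target, bool in_recovery, bool second_pass)
{
	bool		mismatch;

	mismatch = (in_recovery && target == TARGET_PRIMARY) ||
		(!in_recovery && (target == TARGET_PREFER_STANDBY ||
						  target == TARGET_STANDBY));

	/* A mismatch is tolerated only on the prefer-standby fallback pass. */
	if (mismatch && !(target == TARGET_PREFER_STANDBY && second_pass))
		return VERDICT_REJECT;
	return VERDICT_ACCEPT;
}
```

With this rule, "standby" never accepts a server that is not in recovery, while "prefer-standby" accepts one only after the first scan of the host list has found no standby.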
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 0d3edfc..5f31fd0 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1675,7 +1675,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<listitem>
<para>
The supported options for this parameter are <literal>any</literal>,
- <literal>read-write</literal>, <literal>prefer-read</literal> and <literal>read-only</literal>.
+ <literal>read-write</literal>, <literal>prefer-read</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>prefer-standby</literal> and <literal>standby</literal>.
The default value of this parameter, <literal>any</literal>, regards
all connections as acceptable. If multiple hosts are specified in the
connection string, each host is tried in the order given until a connection
@@ -1712,6 +1713,29 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
in which read-only transactions are accepted by default is considered
acceptable.
</para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, only a connection in which
+ the server is not in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection in which
+ the server is in recovery mode is preferred. If no such connections can be found,
+ then a connection in which the server is not in recovery mode will be considered.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable.
+ </para>
+
+ <para>
+ To determine whether the server is in recovery mode, the query
+ <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful connection;
+ if it returns <literal>t</literal>, it means the server is in recovery mode.
+ </para>
+
</listitem>
</varlistentry>
</variablelist>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 7058f5d..0c096d1 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -350,7 +350,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 12, /* sizeof("prefer-read") = 12 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1335,6 +1335,12 @@ connectOptions2(PGconn *conn)
conn->requested_session_type = SESSION_TYPE_PREFER_READ;
else if (strcmp(conn->target_session_attrs, "read-only") == 0)
conn->requested_session_type = SESSION_TYPE_READ_ONLY;
+ else if (strcmp(conn->target_session_attrs, "primary") == 0)
+ conn->requested_session_type = SESSION_TYPE_PRIMARY;
+ else if (strcmp(conn->target_session_attrs, "prefer-standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_PREFER_STANDBY;
+ else if (strcmp(conn->target_session_attrs, "standby") == 0)
+ conn->requested_session_type = SESSION_TYPE_STANDBY;
else
{
conn->status = CONNECTION_BAD;
@@ -2147,6 +2153,60 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested session type. The connection state is set to
+ * try the next host (if any).
+ * In the case of SESSION_TYPE_PREFER_READ, if the read-write host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a read-only host could be made after the first host scan.
+ */
+static void
+reject_checked_read_or_write_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -2229,6 +2289,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -2268,19 +2329,19 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- if (conn->read_write_host_index >= 0)
+ if (conn->read_write_or_primary_host_index >= 0)
{
/*
* Getting here means we failed to connect to read-only servers
* and should now try to connect to a read-write server again.
*/
- conn->whichhost = conn->read_write_host_index;
+ conn->whichhost = conn->read_write_or_primary_host_index;
/*
* Reset the host index value to avoid recursion during the
* second connection attempt.
*/
- conn->read_write_host_index = -2;
+ conn->read_write_or_primary_host_index = -2;
}
else
{
@@ -3507,7 +3568,9 @@ keep_going: /* We will come back to here until there is
* may just skip the test in that case.
*/
if (conn->sversion >= 70400 &&
- conn->requested_session_type != SESSION_TYPE_ANY)
+ (conn->requested_session_type == SESSION_TYPE_READ_WRITE ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY))
{
if (conn->sversion < 130000)
{
@@ -3540,55 +3603,16 @@ keep_going: /* We will come back to here until there is
(conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
conn->requested_session_type == SESSION_TYPE_READ_ONLY)))
{
- /* Not a requested type; fail this connection. */
- const char *displayed_host;
- const char *displayed_port;
-
/*
* The following scenario is possible only for the
* prefer-read mode for the next pass of the list of
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_target_connection;
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_read_or_write_connection(conn);
goto keep_going;
}
@@ -3597,30 +3621,70 @@ keep_going: /* We will come back to here until there is
}
/*
- * Requested type is prefer-read, then record this host index
- * and try the other before considering it later. If requested
- * type of connection is read-only, ignore this connection.
+ * Servers before 9.0 don't support recovery, skip the check
+ * when the requested type of connection is primary,
+ * prefer-standby or standby.
+ */
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_session_type == SESSION_TYPE_PRIMARY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have tried
+ * and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ /*
+ * Requested type is prefer-read or prefer-standby, then
+ * record this host index and try the other before considering
+ * it later. If requested type of connection is read-only or
+ * standby, ignore this connection.
*/
+
if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
- conn->requested_session_type == SESSION_TYPE_READ_ONLY)
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)
{
/*
* The following scenario is possible only for the
- * prefer-read mode for the next pass of the list of
- * connections as it couldn't find any servers that are
- * default read-only.
+ * prefer-read or prefer-standby mode for the next pass of
+ * the list of connections as it couldn't find any servers
+ * that are default read-only or in recovery mode.
*/
- if (conn->read_write_host_index == -2)
- goto target_accept_connection;
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
/* Close connection politely. */
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
/* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY)
+ {
+ if (conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+ }
/*
* Try next host if any, but we don't want to consider
@@ -3755,13 +3819,119 @@ keep_going: /* We will come back to here until there is
* connections as it couldn't find any servers that
* are default read-only.
*/
- if (conn->read_write_host_index == -2)
+ if (conn->read_write_or_primary_host_index == -2)
goto consume_checked_write_connection;
/* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
+ reject_checked_read_or_write_connection(conn);
+ goto keep_going;
+ }
+
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SHOW transaction_read_only". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
+
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in recovery mode and requested mode is
+ * primary, ignore it. Server is not in recovery mode and
+ * requested mode is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server is found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_recovery_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
/* Append error report to conn->errorMessage. */
if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
displayed_host = conn->connhost[conn->whichhost].hostaddr;
@@ -3771,16 +3941,14 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
- if (conn->requested_session_type == SESSION_TYPE_READ_WRITE)
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
+ libpq_gettext("server is in recovery mode "
"\"%s:%s\"\n"),
displayed_host, displayed_port);
else
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a readonly "
- "connection to server "
+ libpq_gettext("server is not in recovery mode "
"\"%s:%s\"\n"),
displayed_host, displayed_port);
@@ -3788,10 +3956,10 @@ keep_going: /* We will come back to here until there is
conn->status = CONNECTION_OK;
sendTerminateConn(conn);
- /* Record read-write host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_READ &&
- conn->read_write_host_index == -1)
- conn->read_write_host_index = conn->whichhost;
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
/*
* Try next host if any, but we don't want to consider
@@ -3801,7 +3969,7 @@ keep_going: /* We will come back to here until there is
goto keep_going;
}
- consume_checked_write_connection:
+ consume_checked_recovery_connection:
/* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3815,7 +3983,7 @@ keep_going: /* We will come back to here until there is
}
/*
- * Something went wrong with "SHOW transaction_read_only". We
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
* should try next addresses.
*/
if (res)
@@ -3831,7 +3999,7 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
"on server \"%s:%s\"\n"),
displayed_host, displayed_port);
@@ -3843,7 +4011,6 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
-
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3989,7 +4156,7 @@ makeEmptyPGconn(void)
#endif
conn->requested_session_type = SESSION_TYPE_ANY;
- conn->read_write_host_index = -1;
+ conn->read_write_or_primary_host_index = -1;
/*
* We try to send at least 8K at a time, which is the usual size of pipe
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index aa6f22f..30b181a 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
@@ -76,8 +77,11 @@ typedef enum
SESSION_TYPE_ANY = 0, /* Any session (default) */
SESSION_TYPE_READ_WRITE, /* Read-write session */
SESSION_TYPE_PREFER_READ, /* Prefer read only session */
- SESSION_TYPE_READ_ONLY /* Read only session */
-} TargetSessionAttrsType;
+ SESSION_TYPE_READ_ONLY, /* Read only session */
+ SESSION_TYPE_PRIMARY, /* Primary server */
+ SESSION_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SESSION_TYPE_STANDBY /* Standby server */
+} TargetSessionAttrsType;
typedef enum
{
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index eb687d8..0efe16f 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -369,7 +369,7 @@ struct pg_conn
/*
* Type of connection to make. Possible values: any, read-write,
- * prefer-read and read-only.
+ * prefer-read, read-only, primary, prefer-standby and standby.
*/
char *target_session_attrs;
TargetSessionAttrsType requested_session_type;
@@ -413,7 +413,7 @@ struct pg_conn
* Initial value is -1, then the index of the first read-write host, -2
* during the second attempt of connection to avoid recursion.
*/
- int read_write_host_index;
+ int read_write_or_primary_host_index;
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index ac1e11e..8fa28da 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 37;
+use Test::More tests => 41;
# Initialize master node
my $node_master = get_new_node('master');
@@ -141,6 +141,22 @@ test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"read-only", 0);
+# Connect to master in "primary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
1.8.3.1
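
Note on the patch above: the two-pass host scan it uses for "prefer-standby" may be easier to follow as a standalone sketch. This is not code from the patch. The function choose_prefer_standby and its in_recovery array are hypothetical, every host is assumed reachable, and in_recovery[i] is assumed to be what SELECT pg_is_in_recovery() would report for host i. The patch additionally sets read_write_or_primary_host_index to -2 during the second attempt to avoid recursion; a plain return value makes that unnecessary here.

```c
#include <stdbool.h>

/*
 * Return the index of the host chosen under "prefer-standby", or -1 when
 * the host list is empty.  Hypothetical helper for illustration only.
 */
int
choose_prefer_standby(const bool *in_recovery, int nhosts)
{
	int			fallback = -1;	/* -1: no primary remembered yet */
	int			i;

	for (i = 0; i < nhosts; i++)
	{
		if (in_recovery[i])
			return i;			/* first standby in list order wins */
		if (fallback == -1)
			fallback = i;		/* remember only the first primary */
	}

	/* No standby found: fall back to the remembered primary, if any. */
	return fallback;
}
```

This corresponds to the regression tests in the patch: a master,standby1 list selects standby1, and a master,master list falls back to the first master.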
v14-0003-Server-recovery-mode-handling.patch (application/octet-stream)
From 6ce04730b7fcc6fa532cbae3a2c149e3da6ae03c Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Mon, 30 Sep 2019 15:45:46 +1000
Subject: [PATCH v14 3/3] Server recovery mode handling
Add "in_recovery" as a GUC_REPORT variable, to update clients when the
server is in recovery mode. This improves the speed of client connections
to a standby server, by avoiding the need to execute a command to
determine if the server is in recovery mode.
Add a new SIGUSR1 interrupt to report recovery mode exit to all backends
and their respective clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
doc/src/sgml/libpq.sgml | 16 ++--
doc/src/sgml/protocol.sgml | 8 +-
src/backend/access/transam/xlog.c | 3 +
src/backend/storage/ipc/procarray.c | 28 +++++++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 +++
src/backend/tcop/postgres.c | 60 +++++++++++++++
src/backend/utils/init/postinit.c | 6 +-
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 16 ++++
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/include/tcop/tcopprot.h | 2 +
src/interfaces/libpq/fe-connect.c | 142 +++++++++++++++++++++++------------
src/interfaces/libpq/fe-exec.c | 4 +
src/interfaces/libpq/libpq-int.h | 1 +
17 files changed, 245 insertions(+), 59 deletions(-)
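
Note on the commit message above: the interrupt handling it describes follows the usual deferred-work pattern, where the signal handler only sets a flag and the state change happens later at a safe point. The standalone sketch below illustrates that pattern with plain POSIX signals. It is not code from the patch: all names are hypothetical, and the real patch also sets the process latch, calls SetConfigOption() for "in_recovery", and flushes the output buffer with pq_flush().

```c
#include <signal.h>
#include <stdbool.h>

/* Set from the signal handler; sig_atomic_t keeps the store async-signal-safe. */
static volatile sig_atomic_t recovery_exit_pending = 0;

/* Stand-in for the reported in_recovery value; starts out "on". */
static bool in_recovery_reported = true;

/* Signal handler part: only record that work is pending. */
void
handle_recovery_exit(int signo)
{
	(void) signo;
	recovery_exit_pending = 1;
}

/*
 * Safe-point part: apply the state change outside the handler.
 * Returns true if a pending interrupt was processed.
 */
bool
process_pending_recovery_exit(void)
{
	if (!recovery_exit_pending)
		return false;

	recovery_exit_pending = 0;
	in_recovery_reported = false;	/* the patch reports "off" to the client here */
	return true;
}

/* Accessor for the current reported state. */
bool
reported_in_recovery(void)
{
	return in_recovery_reported;
}
```

The reported value stays "on" between signal delivery and the next safe point, which is why the patch flushes the ParameterStatus message as soon as the interrupt is processed.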
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 5f31fd0..9f66fb8 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1732,8 +1732,12 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<para>
To determine whether the server is in recovery mode, the query
- <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful connection;
- if it returns <literal>t</literal>, it means the server is in recovery mode.
+ <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful connection
+ if the server is prior to version 13; if it returns <literal>t</literal>, it means the server
+ is in recovery mode.
+ If the server is version 13 or greater, the recovery mode state is determined by the value of
+ the <varname>in_recovery</varname> configuration parameter that is reported by the server upon
+ successful connection.
</para>
</listitem>
@@ -2046,15 +2050,17 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by releases before 13.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 13.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 87b95bc..df3953f 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1284,15 +1284,17 @@ SELECT 1/0;
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
<varname>integer_datetimes</varname>,
- <varname>standard_conforming_strings</varname>, and
- <varname>transaction_read_only</varname>.
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before 9.0;
- <varname>transaction_read_only</varname> was not reported by releases before 13.0.)
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 13.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6c69eb6..4fe506e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7770,6 +7770,9 @@ StartupXLOG(void)
XLogCtl->SharedRecoveryInProgress = false;
SpinLockRelease(&XLogCtl->info_lck);
+ if (standbyState != STANDBY_DISABLED)
+ SendRecoveryExitSignal();
+
UpdateControlFile();
LWLockRelease(ControlFileLock);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 8abcfdf..744475c 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2971,6 +2971,34 @@ CountOtherDBBackends(Oid databaseId, int *nbackends, int *nprepared)
}
/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
* ProcArraySetReplicationSlotXmin
*
* Install limits to future computations of the xmin horizon to prevent vacuum
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7605b2c..e4548dc 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -292,6 +292,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
SetLatch(MyLatch);
latch_sigusr1_handler();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 01ddffe..b0e88ee 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -138,6 +138,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index e8d8e6f..69ce3ec 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -167,6 +167,15 @@ static bool RecoveryConflictPending = false;
static bool RecoveryConflictRetryable = true;
static ProcSignalReason RecoveryConflictReason;
+/*
+ * Inbound recovery exit interrupts are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
+
/* reused buffer to pass to SendRowDescriptionMessage() */
static MemoryContext row_description_context = NULL;
static StringInfoData row_description_buf;
@@ -195,6 +204,7 @@ static void drop_unnamed_stmt(void);
static void log_disconnections(int code, Datum arg);
static void enable_statement_timeout(void);
static void disable_statement_timeout(void);
+static void ProcessRecoveryExitInterrupt(void);
/* ----------------------------------------------------------------
@@ -543,6 +553,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
@@ -2962,6 +2976,52 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
}
/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited the recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+ */
+static void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
+
+
+/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
* If an interrupt condition is pending, and it's safe to service it,
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 29c5ec7..59fb4e9 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -649,7 +649,11 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
* This is handled by calling RecoveryInProgress and ignoring the
* result.
*/
- (void) RecoveryInProgress();
+ if (RecoveryInProgress())
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
}
else
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index 416a087..a4ebcef 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_num \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index c64ec03..633bc2d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -580,6 +580,7 @@ static char *recovery_target_string;
static char *recovery_target_xid_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
@@ -1770,6 +1771,21 @@ static struct config_bool ConfigureNamesBool[] =
},
{
+ /*
+ * Not for general use --- used to indicate whether the instance is
+ * in recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
+ {
{"allow_system_table_mods", PGC_POSTMASTER, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
NULL,
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index da8b672..86f0c13 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -113,6 +113,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void XidCacheRemoveRunningXids(TransactionId xid,
int nxids, const TransactionId *xids,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 05b186a..9cf9560 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -42,6 +42,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index a3f8f82..2c73f0c 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index ec21f7e..ed21a9e 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -66,6 +66,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
* handler */
+/* recovery exit interrupt handling function */
+extern void HandleRecoveryExitInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 0c096d1..7a04618 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2207,6 +2207,58 @@ reject_checked_read_or_write_connection(PGconn *conn)
conn->try_next_host = true;
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested session type (for recovery). The connection state
+ * is set to try the next host (if any).
+ * In the case of SESSION_TYPE_PREFER_STANDBY, if the read-write-or-primary host-index
+ * hasn't been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+reject_checked_recovery_connection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
+ conn->read_write_or_primary_host_index == -1)
+ conn->read_write_or_primary_host_index = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -3630,27 +3682,52 @@ keep_going: /* We will come back to here until there is
conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
conn->requested_session_type == SESSION_TYPE_STANDBY)))
{
- /*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
- */
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ if (conn->sversion < 130000)
{
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_RECOVERY;
+
restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ return PGRES_POLLING_READING;
}
+ else if ((conn->in_recovery &&
+ conn->requested_session_type == SESSION_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY ||
+ conn->requested_session_type == SESSION_TYPE_STANDBY)))
+ {
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->read_write_or_primary_host_index == -2)
+ goto consume_checked_target_connection;
- conn->status = CONNECTION_CHECK_RECOVERY;
+ reject_checked_recovery_connection(conn);
+ goto keep_going;
+ }
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
}
/*
@@ -3932,40 +4009,7 @@ keep_going: /* We will come back to here until there is
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- if (conn->requested_session_type == SESSION_TYPE_PRIMARY)
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
- else
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("server is not in recovery mode "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /* Record primary host index */
- if (conn->requested_session_type == SESSION_TYPE_PREFER_STANDBY &&
- conn->read_write_or_primary_host_index == -1)
- conn->read_write_or_primary_host_index = conn->whichhost;
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ reject_checked_recovery_connection(conn);
goto keep_going;
}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 3c17100..a964483 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1117,6 +1117,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
{
conn->transaction_read_only = (strcmp(value, "on") == 0);
}
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 0efe16f..327dd04 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -446,6 +446,7 @@ struct pg_conn
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
bool transaction_read_only; /* transaction_read_only */
+ bool in_recovery; /* in_recovery */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
--
1.8.3.1
From: Greg Nancarrow <gregn4422@gmail.com>
With the permission of the original patch author, Haribabu Kommi, I’ve
rationalized the existing 8 patches into 3 patches, merging patches
1-5 and 6-7, and tidying up some documentation and code comments. I
also rebased them to the latest PG12 source code (as of October 1,
2019). The patch code itself is the same, except for some version
checks that I have updated to target the features for PG13 instead of
PG12.
I’ve attached the updated patches.
Thank you for taking over this patch. Your arrangement has made the patches much easier to read!
I've finished reviewing, and my comments are below. Unfortunately, 0003 failed to apply (I guess only a slight modification is needed for it to apply to HEAD). I'd like to proceed to testing when the revised patch becomes available.
(1) 0001
+ /*
+ * Requested type is prefer-read, then record this host index
+ * and try the other before considering it later. If requested
+ * type of connection is read-only, ignore this connection.
+ */
+ if (conn->requested_session_type == SESSION_TYPE_PREFER_READ ||
+ conn->requested_session_type == SESSION_TYPE_READ_ONLY)
{
This if statement seems unnecessary, because the following part at the beginning of the CONNECTION_CHECK_TARGET case block precludes entering the if block. Cases other than "any" are handled first here.
if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ conn->requested_session_type != SESSION_TYPE_ANY)
+ {
(2) 0002
-} TargetSessionAttrsType;
+} TargetSessionAttrsType;
One space after } is replaced with three tabs. I guess this is an unintentional change.
(3) 0002
+reject_checked_read_or_write_connection(PGconn *conn)
To follow the naming style of most internal functions in this file, I find it better to change the name to rejectCheckedReadOrWriteConnection.
(4) 0003
+reject_checked_recovery_connection(PGconn *conn)
The same as the previous one.
(5) 0003
Don't we have to describe in_recovery in high-availability.sgml, config.sgml, or both? transaction_read_only is touched in the former.
Regards
Takayuki Tsunakawa
On 2019-Oct-01, Greg Nancarrow wrote:
On Wed, Sep 11, 2019 at 10:17 AM Alvaro Herrera from 2ndQuadrant
<alvherre@alvh.no-ip.org> wrote:
Oh, oops. Here they are then.
With the permission of the original patch author, Haribabu Kommi, I’ve
rationalized the existing 8 patches into 3 patches, merging patches
1-5 and 6-7, and tidying up some documentation and code comments. I
also rebased them to the latest PG12 source code (as of October 1,
2019). The patch code itself is the same, except for some version
checks that I have updated to target the features for PG13 instead of
PG12.
I've spent some time the last few days going over these patches and the
prior discussion.
I'm not sure I understand why we end up with "prefer-read" in addition
to "prefer-standby" (and similar seeming redundancy between "primary"
and "read-write"). Do we really need more than one way to identify
hosts' roles? It seems 0001 adds the "prefer-read" modes by checking
transaction_read_only, and later 0002 adds the "prefer-standby" modes by
checking in_recovery. I'm not sure that we're serving our users very
well by giving them choice that ends up being confusing. In other words
I think we should do only one of these things, not both. Maybe merge
0001 and 0002 in a single patch, and get rid of redundant modes.
There were other comments that I think went largely unaddressed, such as
the point that the JDBC driver seems to offer a different syntax for the
configuration, and should we offer a compatibility shim of some sort.
(Frankly, I don't think we need to stress over this too much, but it
seems that it wasn't even discussed.)
0003 contains parts written by Elvis Pranskevichus. It would be good to
confirm that he is satisfied with how the whole thing ends up working.
Also, Ishii-san said:
/messages/by-id/20190116.150236.2304777214520289427.t-ishii@sraoss.co.jp
- When looking for a primary, find a node where pg_is_in_recovery is
false; if none, libpq should retry until a timeout expires. Did we
reject this idea altogether, or is it just unimplemented?
Looking at 0001, I would move the new "desired connection mode" to
libpq-int.h (from libpq-fe.h), and rename like this
/* Desired connection type */
typedef enum
{
TGT_CONN_TYPE_ANY = 0, /* Any session (default) */
TGT_CONN_TYPE_READ_WRITE, /* Read-write session */
TGT_CONN_TYPE_PREFER_READ, /* Prefer read only session */
TGT_CONN_TYPE_READ_ONLY /* Read only session */
} TargetConnectionType;
The name of the label "consume_checked_write_connection" is not very
descriptive. I propose "conn_succeeded" instead.
"read_write_host_index" seems a very unimaginative struct member name.
Following "whichhost" I propose to rename this to "which_rw_host", and
rewrite its comment to something like this:
/*
* Status indicator for read-write host. The initial value of -1
* indicates that we don't know which server is the read-write one; a
* non-negative number (set as soon as we discover one) indicates which
* server is the read-write one; -2 indicates that the server being tested
* (whichhost???) is the read-write one.
*/
int which_rw_host;
(I'm not sure that the explanation for value -2 is correct. Please
rewrite that if it isn't.)
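One possible reading of the -1 / index / -2 convention, pieced together from the prefer-standby hunks of the patch, can be sketched as follows. This is a simplified model and not libpq code: pick_host_prefer_standby() and the in_recovery array are hypothetical, and the real logic is spread across the PQconnectPoll() state machine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified model of a prefer-standby host scan (hypothetical helper, not
 * libpq code).  in_recovery[i] says whether host i is a standby.
 * Returns the index of the chosen host, or -1 if the host list is empty.
 */
static int
pick_host_prefer_standby(const bool *in_recovery, int nhosts)
{
    int         which_rw_host = -1; /* -1: no read-write host seen yet */
    int         i;

    assert(in_recovery != NULL || nhosts == 0);

    /* First pass: take the first standby, remembering the first primary. */
    for (i = 0; i < nhosts; i++)
    {
        if (in_recovery[i])
            return i;
        if (which_rw_host == -1)
            which_rw_host = i;  /* >= 0: index of a known read-write host */
    }

    /* No standby found: fall back to the host recorded in the first pass. */
    if (which_rw_host >= 0)
    {
        i = which_rw_host;
        which_rw_host = -2;     /* -2: second pass, accept the host as-is */
        return i;
    }
    return -1;
}
```

With hosts {primary, standby, primary} the sketch returns index 1; with only primaries it falls back to index 0, the first read-write host seen.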
I think the if/then/else maze in the CONNECTION_CHECK_TARGET case in
PQconnectPoll() is a nigh unreadable rat's nest after these patches.
Maybe some extra states in the state machine are needed; and probably
that would be helped by some small subroutines to reduce the
duplication. PQconnectPoll is already 1700 lines long; our job is not
made easier by making it 2000 lines long.
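As one illustration of the "small subroutines" idea, the acceptance test that the 0003 hunk spells out inline could be pulled into a single predicate. The enum labels mirror the SESSION_TYPE_* names used in the patch, but the enum definition, the ConnSketch struct, and recovery_state_rejects() are assumptions made for this sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the SESSION_TYPE_* names in the patch; this definition is assumed. */
typedef enum
{
    SESSION_TYPE_ANY = 0,
    SESSION_TYPE_PRIMARY,
    SESSION_TYPE_PREFER_STANDBY,
    SESSION_TYPE_STANDBY
} SessionTypeSketch;

/* Hypothetical stand-in for the two PGconn fields the check looks at. */
typedef struct
{
    bool        in_recovery;
    SessionTypeSketch requested_session_type;
} ConnSketch;

/*
 * Returns true when the server's recovery state does not match the requested
 * session type, i.e. when the connection would be rejected on the first pass
 * over the host list.
 */
static bool
recovery_state_rejects(const ConnSketch *conn)
{
    assert(conn != NULL);

    switch (conn->requested_session_type)
    {
        case SESSION_TYPE_PRIMARY:
            return conn->in_recovery;
        case SESSION_TYPE_PREFER_STANDBY:
        case SESSION_TYPE_STANDBY:
            return !conn->in_recovery;
        case SESSION_TYPE_ANY:
            break;
    }
    return false;
}
```

A caller in the state machine would then reduce to a single "if (recovery_state_rejects(conn))" test followed by the reject-and-try-next-host path.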
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2019-Dec-26, Alvaro Herrera wrote:
The name of the label "consume_checked_write_connection" is not very
descriptive. I propose "conn_succeeded" instead.
(I realized later that I should have removed this paragraph -- other
goto labels are added in 0002 that would make such renaming more
confusing than helpful. My later comment about the if/else/then maze is
more general.)
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, 26 Dec 2019 at 15:07, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
On 2019-Oct-01, Greg Nancarrow wrote:
On Wed, Sep 11, 2019 at 10:17 AM Alvaro Herrera from 2ndQuadrant
<alvherre@alvh.no-ip.org> wrote:
Oh, oops. Here they are then.
With the permission of the original patch author, Haribabu Kommi, I’ve
rationalized the existing 8 patches into 3 patches, merging patches
1-5 and 6-7, and tidying up some documentation and code comments. I
also rebased them to the latest PG12 source code (as of October 1,
2019). The patch code itself is the same, except for some version
checks that I have updated to target the features for PG13 instead of
PG12.
I've spent some time the last few days going over these patches and the
prior discussion.
I'm not sure I understand why we end up with "prefer-read" in addition
to "prefer-standby" (and similar seeming redundancy between "primary"
and "read-write"). Do we really need more than one way to identify
hosts' roles? It seems 0001 adds the "prefer-read" modes by checking
transaction_read_only, and later 0002 adds the "prefer-standby" modes by
checking in_recovery. I'm not sure that we're serving our users very
well by giving them choice that ends up being confusing. In other words
I think we should do only one of these things, not both. Maybe merge
0001 and 0002 in a single patch, and get rid of redundant modes.
There were other comments that I think went largely unaddressed, such as
the point that the JDBC driver seems to offer a different syntax for the
configuration, and should we offer a compatibility shim of some sort.
(Frankly, I don't think we need to stress over this too much, but it
seems that it wasn't even discussed.)
We seem to ignore prior work here, I agree. It would be wonderful if there
were only one syntax. Is it too late to change the syntax for this patch,
as that ship has sailed for JDBC?
On 2019-Dec-26, Dave Cramer wrote:
On Thu, 26 Dec 2019 at 15:07, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:
There were other comments that I think went largely unaddressed,
such as the point that the JDBC driver seems to offer a different
syntax for the configuration, and should we offer a compatibility
shim of some sort. (Frankly, I don't think we need to stress over
this too much, but it seems that it wasn't even discussed.)
We seem to ignore prior work here I agree. It would be wonderful if
there were only one syntax. Is it too late to change the syntax for
this patch as that ship has sailed for JDBC
So, starting with pg10 we have target_session_attrs in libpq. These
patches just add some more "attrs" that can be requested for a session.
Tom's proposal[1] was to rename the conninfo option to match JDBC's
targetServerType, adding a compatibility mechanism so that libpq's
target_session_attrs continues to work for values "any" and
"read-write"; but we already discussed all this with regards to the
pgjdbc param names and we still decided not to use them[2] (ending as
commit 721f7bd3cbcc).
Maybe y'all want to relitigate this for some reason. I can help with
getting an implementation finished once y'all are done with the
politics.
[1]: /messages/by-id/26251.1547504236@sss.pgh.pa.us
[2]: /messages/by-id/CAHg_5grVKbO73CqKNYsCYsX5aJ=deDSAyW44wjmwt1uqngScdQ@mail.gmail.com
(If we do want to match pgJDBC's option name, then I suppose we need to
add a synonym mechanism to libpq's option parsing. That doesn't look
particularly difficult, and it would probably help clean up the mess
that we currently track both the "char *" value of the option as well as
a separate enum value for it, in the pgconn struct.)
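A synonym mechanism of the kind mentioned here could be as small as a lookup table consulted before the normal option search. The sketch below is only an illustration under that assumption: OptionSynonym, option_synonyms, and canonical_option_name() are hypothetical names, not part of libpq, and the single mapping shown is just the example from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical synonym table entry: alias keyword -> canonical keyword. */
typedef struct
{
    const char *alias;
    const char *canonical;
} OptionSynonym;

static const OptionSynonym option_synonyms[] = {
    /* pgJDBC spelling mapped to the libpq keyword (illustrative only) */
    {"targetServerType", "target_session_attrs"},
};

/*
 * Returns the canonical keyword for a conninfo option name, or the name
 * itself when it has no registered synonym.
 */
static const char *
canonical_option_name(const char *keyword)
{
    size_t      i;

    assert(keyword != NULL);

    for (i = 0; i < sizeof(option_synonyms) / sizeof(option_synonyms[0]); i++)
    {
        if (strcmp(keyword, option_synonyms[i].alias) == 0)
            return option_synonyms[i].canonical;
    }
    return keyword;
}
```

Resolving the name once, up front, would let the rest of the parser deal with a single keyword per option.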
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: Alvaro Herrera <alvherre@2ndquadrant.com>
I'm not sure I understand why we end up with "prefer-read" in addition
to "prefer-standby" (and similar seeming redundancy between "primary"
and "read-write"). Do we really need more than one way to identify
hosts' roles? It seems 0001 adds the "prefer-read" modes by checking
transaction_read_only, and later 0002 adds the "prefer-standby" modes by
checking in_recovery. I'm not sure that we're serving our users very
well by giving them choice that ends up being confusing. In other words
I think we should do only one of these things, not both. Maybe merge
0001 and 0002 in a single patch, and get rid of redundant modes.
That's because the distinction read/write is different from primary/standby. If default_transaction_read_only is on, even the primary is read-only. That's why the syntax target_session_attrs = {read-write | read-only} was introduced instead of target_server_type = {primary | standby}. Personally, I only want target_server_type = {primary | standby | prefer-standby}, and discard target_session_attrs for simplicity of the functional specification and the code.
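The distinction can be made concrete with a small model of the two facts a client can learn about a server. ServerFacts and both helpers are hypothetical names used only for this sketch; the field comments name the real queries that would supply the values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of what a client could learn about one server. */
typedef struct
{
    bool        in_recovery;            /* result of pg_is_in_recovery() */
    bool        transaction_read_only;  /* result of SHOW transaction_read_only */
} ServerFacts;

/* "primary" in the sense of target_server_type: not in recovery. */
static bool
is_primary(const ServerFacts *s)
{
    assert(s != NULL);
    return !s->in_recovery;
}

/* "read-write" in the sense of target_session_attrs: sessions can write by default. */
static bool
is_read_write(const ServerFacts *s)
{
    assert(s != NULL);
    return !s->transaction_read_only;
}
```

A primary running with default_transaction_read_only = on corresponds to {false, true}: is_primary() holds while is_read_write() does not, which is exactly the case where the two vocabularies disagree.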
Also, Ishii-san said:
/messages/by-id/20190116.150236.2304777214520289427.t-ishii@sraoss.co.jp
- When looking for a primary, find a node where pg_is_in_recovery is
false; if none, libpq should retry until a timeout expires. Did we
reject this idea altogether, or is it just unimplemented?
I don't remember well, but I guess this is for eliminating the need for applications to retry connection attempts during the database server failover. I think that will be convenient, but not mandatory for this patch. PgJDBC doesn't provide it, either.
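For readers unfamiliar with the idea, the retry behaviour being discussed can be modelled against a simulated timeline, so that no real waiting or network access is involved. Everything here is hypothetical: promoted_at[i] is the simulated second at which host i leaves recovery (-1 meaning never), and find_primary_with_retry() is not a libpq function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simulated "retry until timeout" search for a primary (illustrative only).
 * The host list is rescanned every retry_interval seconds until a host that
 * has left recovery is found or the timeout expires.
 * Returns the index of the primary found, or -1 on timeout.
 */
static int
find_primary_with_retry(const int *promoted_at, int nhosts,
                        int retry_interval, int timeout)
{
    assert(promoted_at != NULL || nhosts == 0);
    assert(retry_interval > 0);

    for (int now = 0; now <= timeout; now += retry_interval)
    {
        for (int i = 0; i < nhosts; i++)
        {
            /* Host i counts as a primary once its promotion time has passed. */
            if (promoted_at[i] >= 0 && promoted_at[i] <= now)
                return i;
        }
        /* A real client would sleep for retry_interval seconds here. */
    }
    return -1;
}
```

In this model a failover that completes within the timeout is invisible to the application, which is the convenience the proposal was after.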
Regards
Takayuki Tsunakawa
On 2019-Dec-27, tsunakawa.takay@fujitsu.com wrote:
From: Alvaro Herrera <alvherre@2ndquadrant.com>
I'm not sure I understand why we end up with "prefer-read" in addition
to "prefer-standby" (and similar seeming redundancy between "primary"
and "read-write"). Do we really need more than one way to identify
hosts' roles? It seems 0001 adds the "prefer-read" modes by checking
transaction_read_only, and later 0002 adds the "prefer-standby" modes by
checking in_recovery. I'm not sure that we're serving our users very
well by giving them choice that ends up being confusing. In other words
I think we should do only one of these things, not both. Maybe merge
0001 and 0002 in a single patch, and get rid of redundant modes.
That's because the distinction read/write is different from
primary/standby. If default_transaction_read_only is on, even the
primary is read-only. That's why the syntax target_session_attrs =
{read-write | read-only} was introduced instead of target_server_type
= {primary | standby}. Personally, I only want target_server_type =
{primary | standby | prefer-standby}, and discard target_session_attrs
for simplicity of the functional specification and the code.
So, we can know whether server is primary/standby by checking
in_recovery, as opposed to knowing whether read-write which is done by
checking transaction_read_only. So we can keep read-write as a synonym
for "primary", and check in_recovery when used in servers that support
the new GUC, and check transaction_read_only in older servers.
It seems there's a lot of code that we can discard from the patch:
first, we can discard checking for "read-only" altogether. Second, have
us check transaction_read_only *only* if the server is of an older
version.
I would discard the whole thing about checking "SELECT pg_is_in_recovery()"
also; let's skip straight to checking SHOW in_recovery (patch 0003).
Let's not introduce a mechanism that ends up obsolete immediately.
By the same token, I propose we don't mark transaction_read_only as a
GUC_REPORT option, since we only do that to let it become obsolete
immediately. If we connect to a server older than 13, just keep sending
the SHOW query.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: Alvaro Herrera <alvherre@2ndquadrant.com>
So, we can know whether server is primary/standby by checking
in_recovery, as opposed to knowing whether read-write which is done by
checking transaction_read_only. So we can keep read-write as a synonym
for "primary", and check in_recovery when used in servers that support
the new GUC, and check transaction_read_only in older servers.
It seems there's a lot of code that we can discard from the patch:
first, we can discard checking for "read-only" altogether. Second, have
us check transaction_read_only *only* if the server is of an older
version.
Let me check my understanding. Are you proposing these?
* The canonical libpq connection parameter is target_session_attr = {primary | standby | prefer-standby}. Leave and document read-write as a synonym for primary.
* When the server version is 13 or later, libpq just checks in_recovery, not checking transaction_read_only or sending SHOW transaction_read_only.
* When the server version is before 13, libpq sends SHOW transaction_read_only as before.
Personally, 100% agreed, considering what we really wanted to do when target_session_attr was introduced is to tell if the server is primary or standby. The questions are:
Q1: Should we continue to use the name target_session_attr, or rename it to target_server_type and make target_session_attr a synonym for it? I'm in favor of the latter.
Q2: Can we accept the subtle incompatibility that target_session_attr=read-write and target_server_type=primary are not the same, when default_transaction_read_only is on? (I'd like to hear yes)
Q3: Can we go without supporting standby and prefer-standby for older servers? (I think yes because we can say that it's a new feature effective for new servers.)
Regards
Takayuki Tsunakawa
On 2020-Jan-06, tsunakawa.takay@fujitsu.com wrote:
Let me check my understanding. Are you proposing these?
* The canonical libpq connection parameter is target_session_attr = {primary | standby | prefer-standby}. Leave and document read-write as a synonym for primary.
* When the server version is 13 or later, libpq just checks in_recovery, not checking transaction_read_only or sending SHOW transaction_read_only.
* When the server version is before 13, libpq sends SHOW transaction_read_only as before.
Yes, that sounds good to me.
Personally, 100% agreed, considering what we really wanted to do when target_session_attr was introduced is to tell if the server is primary or standby. The questions are:
Q1: Should we continue to use the name target_session_attr, or rename it to target_server_type and make target_session_attr a synonym for it? I'm in favor of the latter.
I'm not 100% sure about this. I think part of the reason of making it
target_session_attrs (note plural) is that the user could be able to
specify more than one attribute (a comma-separated list, like the
DateStyle GUC), if we supported some hypothetical attributes in the
future that are independent of the existing ones. I'm not inclined to
break that, unless the authors of the original feature agree to that.
Maybe one possible improvement would be to add target_server_type as an
additional one, that only accepts a single item (primary/standby/prefer-standby),
as a convenience, while target_session_attrs retains its ability to
receive more than one value. The two would be somewhat redundant but
not exact synonyms.
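The difference between a list-valued target_session_attrs and a single-valued target_server_type might look like this in code. Both helpers are hypothetical, the list syntax is only the comma-separated form suggested above, and "some-future-attr" in the usage note stands in for an attribute that does not exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true if attr appears as a whole item of the comma-separated list
 * attrs.  Whitespace is not trimmed in this sketch.
 */
static bool
attrs_list_contains(const char *attrs, const char *attr)
{
    size_t      len;

    assert(attrs != NULL && attr != NULL);
    len = strlen(attr);

    while (*attrs != '\0')
    {
        size_t      item_len = strcspn(attrs, ",");

        if (item_len == len && strncmp(attrs, attr, len) == 0)
            return true;
        attrs += item_len;
        if (*attrs == ',')
            attrs++;            /* skip the separator */
    }
    return false;
}

/* target_server_type, as floated here, would accept exactly one item. */
static bool
is_valid_server_type(const char *value)
{
    assert(value != NULL);
    return strcmp(value, "primary") == 0 ||
        strcmp(value, "standby") == 0 ||
        strcmp(value, "prefer-standby") == 0;
}
```

So "primary,some-future-attr" would be a valid list containing "primary", while the same string would be rejected as a server type.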
Q2: Can we accept the subtle incompatibility that
target_session_attr=read-write and target_server_type=primary are not
the same, when default_transaction_read_only is on? (I'd like to hear
yes)
... on servers versions 12 and older, yes. (If I understand correctly,
we wouldn't have such a difference in version 13).
Q3: Can we go without supporting standby and prefer-standby for older
servers? (I think yes because we can say that it's a new feature
effective for new servers.)
Yes.
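Taken together, the answers in this exchange suggest a version-based dispatch roughly like the sketch below. The enums and choose_target_check() are hypothetical; the 130000 cutoff reflects the thread's working assumption that in_recovery would first be reported by version 13 servers, and would change if the feature landed in a later release.

```c
#include <assert.h>

/* How the client would decide whether a server matches the request. */
typedef enum
{
    CHECK_NONE,                 /* "any": accept without checking */
    CHECK_REPORTED_IN_RECOVERY, /* use the in_recovery value the server reports */
    CHECK_SHOW_READ_ONLY,       /* send SHOW transaction_read_only, as before */
    CHECK_UNSUPPORTED           /* standby modes on a server lacking in_recovery */
} TargetCheck;

typedef enum
{
    REQ_ANY,
    REQ_PRIMARY,                /* "read-write" treated as a synonym */
    REQ_STANDBY,
    REQ_PREFER_STANDBY
} RequestedType;

/* sversion uses the PQserverVersion() encoding, e.g. 130000 for 13.0. */
static TargetCheck
choose_target_check(int sversion, RequestedType req)
{
    assert(sversion > 0);

    if (req == REQ_ANY)
        return CHECK_NONE;
    if (sversion >= 130000)
        return CHECK_REPORTED_IN_RECOVERY;
    if (req == REQ_PRIMARY)
        return CHECK_SHOW_READ_ONLY;
    return CHECK_UNSUPPORTED;   /* Q3: no standby modes for older servers */
}
```

The CHECK_SHOW_READ_ONLY branch is where the Q2 incompatibility lives: on older servers a primary with default_transaction_read_only on would still be rejected.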
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
MauMau, Greg, is any of you submitting a new patch for this?
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2/28/20 11:05 AM, Alvaro Herrera wrote:
MauMau, Greg, is any of you submitting a new patch for this?
This patch has not had any updates in months and now we are halfway
through the CF so I have marked it Returned with Feedback.
If a patch arrives soon I'll be happy to revive the entry, otherwise
please submit to a future CF when a new patch is available.
Regards,
--
-David
david@pgmasters.net
Hi Hackers,
I'd like to submit a new version of a patch that I'd previously
submitted but was eventually Returned with Feedback (closed in
commitfest 2020-03).
The patch enhances the libpq "target_session_attrs" connection
parameter by supporting primary/standby/prefer-standby, and I've
attempted some sort of alignment with similar PGJDBC driver
functionality by adding a "target_server_type" parameter. Now targets
PG14.
I've merged the original set of 3 patches into one patch and tried to
account for most(?) of the requested changes in the feedback comments;
if nothing else, it should be easier to read and understand.
Previous discussion here:
/messages/by-id/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
Regards,
Greg Nancarrow
Fujitsu Australia
Attachments:
v15-0001-Enhance-libpq-target_session_attrs-and-add-target_se.patch (application/octet-stream)
From 7ddf48f99db63538cdd91a49fe2c3d3c96b66e6f Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Mon, 18 May 2020 15:08:17 +1000
Subject: [PATCH v15] Enhance libpq target_session_attrs and add
target_server_type.
Enhance the connection parameter "target_session_attrs" to support values
primary/standby/prefer-standby (using the existing "read-write" as a synonym
for "primary"). To provide closer alignment with similar functionality in the
PGJDBC driver, add a new connection parameter "target_server_type".
Add "in_recovery" as a GUC_REPORT variable, to update clients when the server
is in recovery mode. This improves the speed of client connections to a standby
server, by avoiding the need to execute a command to determine if the server is
in recovery mode.
Add a new SIGUSR1 interrupt to support reporting of recovery mode exit
to all backends and their respective clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
contrib/postgres_fdw/expected/postgres_fdw.out | 2 +-
doc/src/sgml/high-availability.sgml | 5 +-
doc/src/sgml/libpq.sgml | 111 +++++-
doc/src/sgml/protocol.sgml | 8 +-
src/backend/access/transam/xlog.c | 3 +
src/backend/storage/ipc/procarray.c | 28 ++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 +
src/backend/tcop/postgres.c | 59 ++++
src/backend/utils/init/postinit.c | 9 +-
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 16 +
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/include/tcop/tcopprot.h | 2 +
src/interfaces/libpq/fe-connect.c | 446 ++++++++++++++++++++++---
src/interfaces/libpq/fe-exec.c | 6 +-
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 48 ++-
src/test/recovery/t/001_stream_rep.pl | 146 +++++++-
21 files changed, 830 insertions(+), 80 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 90db550..ba0ba6e 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -8898,7 +8898,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, target_server_type, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 44cc5d2..6bc081b 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1885,8 +1885,9 @@ if (!triggered)
</para>
<para>
- During hot standby, the parameter <varname>transaction_read_only</varname> is always
- true and may not be changed. But as long as no attempt is made to modify
+ During hot standby, the parameters <varname>in_recovery</varname> and
+ <varname>transaction_read_only</varname> are always true and may not be
+ changed. But as long as no attempt is made to modify
the database, connections during hot standby will act much like any other
database connection. If failover or switchover occurs, the database will
switch to normal processing mode. Sessions will remain connected while the
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 52631f4..8409ecc 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1811,18 +1811,89 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- If this parameter is set to <literal>read-write</literal>, only a
- connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>primary</literal>, <literal>standby</literal> and
+ <literal>prefer-standby</literal>.
+ <literal>primary</literal> may alternatively be specified as <literal>read-write</literal>.
+ <literal>standby</literal> may alternatively be specified as <literal>secondary</literal>.
+ <literal>prefer-standby</literal> may alternatively be specified as
+ <literal>prefer-secondary</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts are specified in the
+ connection string, each host is tried in the order given until a connection
+ is successful.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then if the server is version 14
+ or greater, only a connection in which the server is not in recovery mode is considered
+ acceptable. The recovery mode state is determined by the value of the
+ <varname>in_recovery</varname> configuration parameter that is reported by the server upon
+ successful connection. Otherwise, if the server is prior to version 14, only a connection in
+ which read-write transactions are accepted by default is considered acceptable. To determine
+ whether the server supports read-write transactions, the query
+ <literal>SHOW transaction_read_only</literal> will be sent upon any successful connection; if
+ it returns <literal>on</literal>, it means the server doesn't support read-write transactions.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable. If the server is prior to version 14,
+ the query <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful
+ connection; if it returns <literal>t</literal>, it means the server is in recovery mode.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection in which
+ the server is in recovery mode is preferred. If no such connections can be found,
+ then a connection in which the server is not in recovery mode will be considered.
+ </para>
+
</listitem>
- </varlistentry>
+ </varlistentry>
+
+ <varlistentry id="libpq-connect-target-server-type" xreflabel="target_server_type">
+ <term><literal>target_server_type</literal></term>
+ <listitem>
+ <para>
+ The supported options for this parameter are <literal>primary</literal>,
+ <literal>standby</literal> and <literal>prefer-standby</literal>.
+ <literal>primary</literal> may alternatively be specified as <literal>read-write</literal>.
+ <literal>standby</literal> may alternatively be specified as <literal>secondary</literal>.
+ <literal>prefer-standby</literal> may alternatively be specified as
+ <literal>prefer-secondary</literal>.
+ This parameter overrides any connection type specified by <literal>target_session_attrs</literal>.
+ If multiple hosts are specified in the connection string, each host is tried in the order given
+ until a connection is successful.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then if the server is version 14
+ or greater, only a connection in which the server is not in recovery mode is considered
+ acceptable. The recovery mode state is determined by the value of the
+ <varname>in_recovery</varname> configuration parameter that is reported by the server upon
+ successful connection. Otherwise, if the server is prior to version 14, only a connection in
+ which read-write transactions are accepted by default is considered acceptable. To determine
+ whether the server supports read-write transactions, the query
+ <literal>SHOW transaction_read_only</literal> will be sent upon any successful connection; if
+ it returns <literal>on</literal>, it means the server doesn't support read-write transactions.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable. If the server is prior to version 14,
+ the query <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful
+ connection; if it returns <literal>t</literal>, it means the server is in recovery mode.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection in which
+ the server is in recovery mode is preferred. If no such connections can be found,
+ then a connection in which the server is not in recovery mode will be considered.
+ </para>
+
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
</sect2>
@@ -2130,14 +2201,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>in_recovery</varname> was not reported by releases before 14.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
@@ -7237,6 +7310,16 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
linkend="libpq-connect-target-session-attrs"/> connection parameter.
</para>
</listitem>
+
+ <listitem>
+ <para>
+ <indexterm>
+ <primary><envar>PGTARGETSERVERTYPE</envar></primary>
+ </indexterm>
+ <envar>PGTARGETSERVERTYPE</envar> behaves the same as the <xref
+ linkend="libpq-connect-target-server-type"/> connection parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 20d1fe0..053a3ca 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>in_recovery</varname> was not reported by releases before 14.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index ca09d81..dbb3bd6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7940,6 +7940,9 @@ StartupXLOG(void)
XLogCtl->SharedRecoveryState = RECOVERY_STATE_DONE;
SpinLockRelease(&XLogCtl->info_lck);
+ if (standbyState != STANDBY_DISABLED)
+ SendRecoveryExitSignal();
+
UpdateControlFile();
LWLockRelease(ControlFileLock);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 3c2b369..e6e63c6 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3082,6 +3082,34 @@ TerminateOtherDBBackends(Oid databaseId)
}
/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
* ProcArraySetReplicationSlotXmin
*
* Install limits to future computations of the xmin horizon to prevent vacuum
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 7b0c6ff..46a4dda 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -564,6 +564,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
if (CheckProcSignalBarrier())
{
InterruptPending = true;
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index bdaf10a..2045552 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -140,6 +140,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8958ec8..4d227db 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -162,6 +162,14 @@ static bool UseSemiNewlineNewline = false; /* -j switch */
static bool RecoveryConflictPending = false;
static bool RecoveryConflictRetryable = true;
static ProcSignalReason RecoveryConflictReason;
+/*
+ * Inbound recovery exit interrupts are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
/* reused buffer to pass to SendRowDescriptionMessage() */
static MemoryContext row_description_context = NULL;
@@ -191,6 +199,7 @@ static void drop_unnamed_stmt(void);
static void log_disconnections(int code, Datum arg);
static void enable_statement_timeout(void);
static void disable_statement_timeout(void);
+static void ProcessRecoveryExitInterrupt(void);
/* ----------------------------------------------------------------
@@ -539,6 +548,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
@@ -3009,6 +3022,52 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
}
/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited the recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+*/
+static void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
+
+
+/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
* If an interrupt condition is pending, and it's safe to service it,
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index f4247ea..e8f9fd0 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -646,10 +646,13 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
/*
* The postmaster already started the XLOG machinery, but we need to
* call InitXLOGAccess(), if the system isn't in hot-standby mode.
- * This is handled by calling RecoveryInProgress and ignoring the
- * result.
+ * This is handled by calling RecoveryInProgress.
*/
- (void) RecoveryInProgress();
+ if (RecoveryInProgress())
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
}
else
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index 416a087..a4ebcef 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_num \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 2f3e0a7..e366b40 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -619,6 +619,7 @@ static char *recovery_target_string;
static char *recovery_target_xid_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
@@ -1869,6 +1870,21 @@ static struct config_bool ConfigureNamesBool[] =
},
{
+ /*
+ * Not for general use --- used to indicate whether the instance is
+ * in recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
+ {
{"allow_system_table_mods", PGC_SUSET, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
NULL,
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index a5c7d0c..08b3b04 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -113,6 +113,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void TerminateOtherDBBackends(Oid databaseId);
extern void XidCacheRemoveRunningXids(TransactionId xid,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 90607df..2c0a5f8 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -42,6 +42,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index cfbe426..e5d42ac 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index bd30607..d29dd1f 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -68,6 +68,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
* handler */
+/* recovery exit interrupt handling function */
+extern void HandleRecoveryExitInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index d5da6dc..cf6e79d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -351,9 +351,14 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
+ {"target_server_type", "PGTARGETSERVERTYPE",
+ NULL, NULL,
+ "Target-Server-Type", "", 17, /* sizeof("prefer-secondary") = 17 */
+ offsetof(struct pg_conn, target_server_type)},
+
/* Terminating entry --- MUST BE LAST */
{NULL, NULL, NULL, NULL,
NULL, NULL, 0}
@@ -995,6 +1000,30 @@ parse_comma_separated_list(char **startptr, bool *more)
}
/*
+ * validateAndRecordTargetServerType
+ *
+ * Validate a given target server option value and record the requested server
+ * type. All valid target_server_type option values are also allowed in
+ * target_session_attrs (as a single option value).
+ *
+ * Returns true if OK, false if the specified option value is invalid.
+ */
+static bool
+validateAndRecordTargetServerType(const char *optionValue, TargetServerType *requestedServerType)
+{
+ if (strcmp(optionValue, "primary") == 0 || strcmp(optionValue, "read-write") == 0)
+ *requestedServerType = SERVER_TYPE_PRIMARY;
+ else if (strcmp(optionValue, "prefer-standby") == 0 || strcmp(optionValue, "prefer-secondary") == 0)
+ *requestedServerType = SERVER_TYPE_PREFER_STANDBY;
+ else if (strcmp(optionValue, "standby") == 0 || strcmp(optionValue, "secondary") == 0)
+ *requestedServerType = SERVER_TYPE_STANDBY;
+ else
+ return false;
+
+ return true;
+}
+
+/*
* connectOptions2
*
* Compute derived connection options after absorbing all user-supplied info.
@@ -1387,8 +1416,9 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ else if (!validateAndRecordTargetServerType(conn->target_session_attrs, &conn->requested_server_type))
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -1399,6 +1429,23 @@ connectOptions2(PGconn *conn)
}
/*
+ * Validate target_server_type option.
+ * If a target_server_type is specified, it overrides any target server
+ * type specified in target_session_attrs.
+ */
+ if (conn->target_server_type)
+ {
+ if (!validateAndRecordTargetServerType(conn->target_server_type, &conn->requested_server_type))
+ {
+ conn->status = CONNECTION_BAD;
+ printfPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("invalid target_server_type value: \"%s\"\n"),
+ conn->target_server_type);
+ return false;
+ }
+ }
+
+ /*
* Only if we get this far is it appropriate to try to connect. (We need a
* state flag, rather than just the boolean result of this function, in
* case someone tries to PQreset() the PGconn.)
@@ -2223,6 +2270,110 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type. The connection state is set to
+ * try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the read-write host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedReadOrWriteConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_rw_host == -1)
+ conn->which_rw_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (for recovery). The connection state
+ * is set to try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the read-write host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedRecoveryConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_rw_host == -1)
+ conn->which_rw_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -2305,6 +2456,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -2341,12 +2493,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->which_rw_host >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to read-only servers
+ * and should now try to re-connect to a previously-connected-to
+ * read-write server, whose host index is recorded in which_rw_host.
+ */
+ conn->whichhost = conn->which_rw_host;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->which_rw_host = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
+ else
conn->whichhost++;
/* Drop any address info for previous host */
@@ -3556,38 +3727,102 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a primary (not-in-recovery / read-write) connection is
+ * required, see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
* may just skip the test in that case.
*/
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ if ((conn->sversion >= 70400 &&
+ (conn->requested_server_type == SERVER_TYPE_PRIMARY ||
+ conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ if (conn->sversion < 140000)
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->in_recovery &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby type for the next pass of the list
+ * of connections, as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->which_rw_host == -2)
+ goto consume_checked_target_connection;
+
+ rejectCheckedRecoveryConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * If the requested type of connection is prefer-standby, then record
+ * this host index and try other specified hosts before considering it later.
+ * If the requested type of connection is standby, ignore this connection.
+ */
+
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
{
/*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections, as it couldn't find any servers that
+ * are in recovery.
*/
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
+ if (conn->which_rw_host == -2)
+ goto consume_checked_target_connection;
+ /* Close connection politely. */
conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY)
{
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ if (conn->which_rw_host == -1)
+ conn->which_rw_host = conn->whichhost;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3686,42 +3921,149 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested server type is primary,
+ * ignore it. Server is read-write and requested type is
+ * prefer-standby, record it for the first time and try to
+ * consume in the next scan (it means no standby server
+ * was found in the first scan). Server is read-write and
+ * requested type is standby, ignore this connection.
+ */
+ if ((readonly_server &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
{
- /* Not writable; fail this connection. */
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby type for the next pass of the list of
+ * connections, as it couldn't find any servers that
+ * are read-only by default.
+ */
+ if (conn->which_rw_host == -2)
+ goto consume_checked_write_connection;
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SHOW transaction_read_only". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
+
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in recovery mode and requested type is
+ * primary, ignore it. Server is not in recovery mode and
+ * requested type is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server was found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
/*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
+ * The following scenario is possible only for the
+ * prefer-standby type for the next pass of the list
+ * of connections, as it couldn't find any servers that
+ * are in recovery.
*/
- conn->try_next_host = true;
+ if (conn->which_rw_host == -2)
+ goto consume_checked_recovery_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ rejectCheckedRecoveryConnection(conn);
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_recovery_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3734,7 +4076,7 @@ keep_going: /* We will come back to here until there is
}
/*
- * Something went wrong with "SHOW transaction_read_only". We
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
* should try next addresses.
*/
if (res)
@@ -3750,7 +4092,7 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
"on server \"%s:%s\"\n"),
displayed_host, displayed_port);
@@ -3762,7 +4104,6 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
-
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3906,6 +4247,9 @@ makeEmptyPGconn(void)
conn->try_gss = true;
#endif
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ conn->which_rw_host = -1;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index eea0237..be73aa7 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1058,7 +1058,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1112,6 +1112,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 3b6a9fb..9c59feb 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 1de91ae..8033fc0 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -317,6 +317,15 @@ typedef struct pg_conn_host
* found in password file. */
} pg_conn_host;
+/* Target server type to connect to */
+typedef enum
+{
+ SERVER_TYPE_ANY = 0, /* Any server (default) */
+ SERVER_TYPE_PRIMARY, /* Primary server */
+ SERVER_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SERVER_TYPE_STANDBY /* Standby server */
+} TargetServerType;
+
/*
* PGconn stores all the state data associated with a single connection
* to a backend.
@@ -370,9 +379,33 @@ struct pg_conn
char *ssl_min_protocol_version; /* minimum TLS protocol version */
char *ssl_max_protocol_version; /* maximum TLS protocol version */
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values:
+ * "any"
+ * "primary" (or "read-write")
+ * "prefer-standby" (or "prefer-secondary")
+ * "standby" (or "secondary").
+ */
char *target_session_attrs;
+ /*
+ * Type of server to connect to. Possible values:
+ * "primary" (or "read-write")
+ * "prefer-standby" (or "prefer-secondary")
+ * "standby" (or "secondary").
+ * This overrides any connection type specified by target_session_attrs.
+ * This option is almost a synonym for the target_session_attrs option, except
+ * its purpose is to closely reflect the similar PGJDBC targetServerType option.
+ * Note also that this option only accepts single option values, whereas in
+ * future, target_session_attrs may accept multiple session attribute values.
+ */
+ char *target_server_type;
+
+ /*
+ * The requested server type, derived from target_session_attrs / target_server_type.
+ */
+ TargetServerType requested_server_type;
+
/* Optional file to write trace info to */
FILE *Pfdebug;
@@ -406,6 +439,17 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * Index of the first read-write host encountered (if any) in the connection string.
+ *
+ * The initial value is -1, indicating that no read-write host has yet been found.
+ * It is then set to the index of the first read-write host, if one is found in the
+ * connection string during processing. If a second connection attempt is later made
+ * to that read-write host, which_rw_host is then set to -2 to avoid recursion during
+ * processing (and whichhost is set to the read-write host index).
+ */
+ int which_rw_host;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
@@ -436,6 +480,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool in_recovery; /* server reported in_recovery = on */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
@@ -540,7 +585,6 @@ struct pg_cancel
int be_key; /* key of backend --- needed for cancels */
};
-
/* String descriptions of the ExecStatusTypes.
* direct use of this array is deprecated; call PQresStatus() instead.
*/
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 0c316c1..1dc7ba0 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 35;
+use Test::More tests => 61;
# Initialize master node
my $node_master = get_new_node('master');
@@ -121,6 +121,150 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to master in "primary" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_master,
+ "primary", 0);
+
+# Connect to master in "primary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to master in "prefer-secondary" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "secondary" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "secondary", 0);
+
+# Connect to standby1 in "secondary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "secondary", 0);
+
+# Tests for connection parameter target_server_type
+note "testing connection parameter \"target_server_type\"";
+
+# Routine designed to run tests on the connection parameter
+# target_server_type with multiple nodes.
+sub test_target_server_type
+{
+ my $node1 = shift;
+ my $node2 = shift;
+ my $target_node = shift;
+ my $mode = shift;
+ my $status = shift;
+
+ my $node1_host = $node1->host;
+ my $node1_port = $node1->port;
+ my $node1_name = $node1->name;
+ my $node2_host = $node2->host;
+ my $node2_port = $node2->port;
+ my $node2_name = $node2->name;
+
+ my $target_name = $target_node->name;
+
+ # Build connection string for connection attempt.
+ my $connstr = "host=$node1_host,$node2_host ";
+ $connstr .= "port=$node1_port,$node2_port ";
+ $connstr .= "target_server_type=$mode";
+
+ # The client used for the connection does not matter, only the backend
+ # point does.
+ my ($ret, $stdout, $stderr) =
+ $node1->psql('postgres', 'SHOW port;',
+ extra_params => [ '-d', $connstr ]);
+ is( $status == $ret && $stdout eq $target_node->port,
+ 1,
+ "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+
+ return;
+}
+
+# Connect to master in "read-write" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_master,
+ "read-write", 0);
+
+# Connect to master in "read-write" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_master,
+ "read-write", 0);
+
+# Connect to master in "primary" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_master,
+ "primary", 0);
+
+# Connect to master in "primary" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_server_type($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to master in "prefer-secondary" mode with master,master list.
+test_target_server_type($node_master, $node_master, $node_master,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "secondary" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_standby_1,
+ "secondary", 0);
+
+# Connect to standby1 in "secondary" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_standby_1,
+ "secondary", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
1.8.3.1
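For readers following the prefer-standby handling in the patch above, the host-selection order can be sketched outside libpq as below. This is only an illustration under stated assumptions: choose_host, wanted_type and the WANT_* names are hypothetical and do not appear in the patch, and each host's recovery state is supplied as a plain array instead of being read from the in_recovery parameter or from a query. The first read-write host is remembered in the same way the patch uses which_rw_host (starting at -1). The patch additionally sets which_rw_host to -2 while reconnecting to that host; the sketch does not need that step because it never opens a connection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's TargetServerType. */
typedef enum
{
	WANT_ANY,
	WANT_PRIMARY,
	WANT_PREFER_STANDBY,
	WANT_STANDBY
} wanted_type;

/*
 * Return the index of the host that would be used, or -1 if no host is
 * acceptable.  in_recovery[i] is true when host i is a standby.
 */
static int
choose_host(const bool *in_recovery, size_t nhosts, wanted_type wanted)
{
	int			which_rw_host = -1; /* no read-write host seen yet */
	size_t		i;

	assert(in_recovery != NULL || nhosts == 0);

	/* First pass: take the first host that matches the request outright. */
	for (i = 0; i < nhosts; i++)
	{
		bool		standby = in_recovery[i];

		switch (wanted)
		{
			case WANT_ANY:
				return (int) i;
			case WANT_PRIMARY:
				if (!standby)
					return (int) i;
				break;
			case WANT_STANDBY:
				if (standby)
					return (int) i;
				break;
			case WANT_PREFER_STANDBY:
				if (standby)
					return (int) i;
				/* Remember only the first read-write host for later. */
				if (which_rw_host == -1)
					which_rw_host = (int) i;
				break;
		}
	}

	/* Second pass: prefer-standby falls back to the remembered host. */
	if (wanted == WANT_PREFER_STANDBY)
		return which_rw_host;

	return -1;
}
```

With hosts listed as master,standby1, WANT_PREFER_STANDBY yields index 1; with master,master it falls back to index 0, while WANT_STANDBY on the same list yields -1. These cases correspond to the combinations exercised in 001_stream_rep.pl.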
On 18 May 2020, at 09:33, Greg Nancarrow <gregn4422@gmail.com> wrote:
> I'd like to submit a new version of a patch that I'd previously
> submitted but was eventually Returned with Feedback (closed in
> commitfest 2020-03).
This patch no longer applies, can you please submit a rebased version? I've
marked the entry as Waiting on Author in the meantime.
cheers ./daniel
> This patch no longer applies, can you please submit a rebased version? I've
> marked the entry as Waiting on Author in the meantime.
Here's a rebased version of the patch.
Regards,
Greg
Attachments:
v16-0001-Enhance-libpq-target_session_attrs-and-add-target_se.patch (application/octet-stream)
From 6800792d3c6ff3853b47ed33e54f63de18f4b657 Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Fri, 3 Jul 2020 14:44:27 +1000
Subject: [PATCH v16] Enhance libpq target_session_attrs and add
target_server_type.
Enhance the connection parameter "target_session_attrs" to support values
primary/standby/prefer-standby (using the existing "read-write" as a synonym
for "primary"). To provide closer alignment with similar functionality in the
PGJDBC driver, add a new connection parameter "target_server_type".
Add "in_recovery" as a GUC_REPORT variable, to update clients when the server
is in recovery mode. This improves the speed of client connections to a standby
server, by avoiding the need to execute a command to determine if the server is
in recovery mode.
Add new SIGUSR1 interrupt handling to support reporting of recovery mode exit
to all backends and their respective clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
contrib/postgres_fdw/expected/postgres_fdw.out | 2 +-
doc/src/sgml/high-availability.sgml | 5 +-
doc/src/sgml/libpq.sgml | 111 +++++-
doc/src/sgml/protocol.sgml | 8 +-
src/backend/access/transam/xlog.c | 3 +
src/backend/storage/ipc/procarray.c | 28 ++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 +
src/backend/tcop/postgres.c | 59 ++++
src/backend/utils/init/postinit.c | 9 +-
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 16 +
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/include/tcop/tcopprot.h | 2 +
src/interfaces/libpq/fe-connect.c | 446 ++++++++++++++++++++++---
src/interfaces/libpq/fe-exec.c | 6 +-
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 48 ++-
src/test/recovery/t/001_stream_rep.pl | 146 +++++++-
21 files changed, 830 insertions(+), 80 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 82fc129..13d1c72 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -8898,7 +8898,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, target_server_type, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 65c3fc6..f81a7d6 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1887,8 +1887,9 @@ if (!triggered)
</para>
<para>
- During hot standby, the parameter <varname>transaction_read_only</varname> is always
- true and may not be changed. But as long as no attempt is made to modify
+ During hot standby, the parameters <varname>in_recovery</varname> and
+ <varname>transaction_read_only</varname> are always true and may not be
+ changed. But as long as no attempt is made to modify
the database, connections during hot standby will act much like any other
database connection. If failover or switchover occurs, the database will
switch to normal processing mode. Sessions will remain connected while the
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index ea1909c..f8d0a6d 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1811,18 +1811,89 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- If this parameter is set to <literal>read-write</literal>, only a
- connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>primary</literal>, <literal>standby</literal> and
+ <literal>prefer-standby</literal>.
+ <literal>primary</literal> may alternatively be specified as <literal>read-write</literal>.
+ <literal>standby</literal> may alternatively be specified as <literal>secondary</literal>.
+ <literal>prefer-standby</literal> may alternatively be specified as
+ <literal>prefer-secondary</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts are specified in the
+ connection string, each host is tried in the order given until a connection
+ is successful.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then if the server is version 14
+ or greater, only a connection in which the server is not in recovery mode is considered
+ acceptable. The recovery mode state is determined by the value of the
+ <varname>in_recovery</varname> configuration parameter that is reported by the server upon
+ successful connection. Otherwise, if the server is prior to version 14, only a connection in
+ which read-write transactions are accepted by default is considered acceptable. To determine
+ whether the server supports read-write transactions, the query
+ <literal>SHOW transaction_read_only</literal> will be sent upon any successful connection; if
+ it returns <literal>on</literal>, it means the server doesn't support read-write transactions.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable. If the server is prior to version 14,
+ the query <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful
+ connection; if it returns <literal>t</literal>, it means the server is in recovery mode.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection in which
+ the server is in recovery mode is preferred. If no such connections can be found,
+ then a connection in which the server is not in recovery mode will be considered.
+ </para>
+
</listitem>
- </varlistentry>
+ </varlistentry>
+
+ <varlistentry id="libpq-connect-target-server-type" xreflabel="target_server_type">
+ <term><literal>target_server_type</literal></term>
+ <listitem>
+ <para>
+ The supported options for this parameter are <literal>primary</literal>,
+ <literal>standby</literal> and <literal>prefer-standby</literal>.
+ <literal>primary</literal> may alternatively be specified as <literal>read-write</literal>.
+ <literal>standby</literal> may alternatively be specified as <literal>secondary</literal>.
+ <literal>prefer-standby</literal> may alternatively be specified as
+ <literal>prefer-secondary</literal>.
+ This parameter overrides any connection type specified by <literal>target_session_attrs</literal>.
+ If multiple hosts are specified in the connection string, each host is tried in the order given
+ until a connection is successful.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then if the server is version 14
+ or greater, only a connection in which the server is not in recovery mode is considered
+ acceptable. The recovery mode state is determined by the value of the
+ <varname>in_recovery</varname> configuration parameter that is reported by the server upon
+ successful connection. Otherwise, if the server is prior to version 14, only a connection in
+ which read-write transactions are accepted by default is considered acceptable. To determine
+ whether the server supports read-write transactions, the query
+ <literal>SHOW transaction_read_only</literal> will be sent upon any successful connection; if
+ it returns <literal>on</literal>, it means the server doesn't support read-write transactions.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable. If the server is prior to version 14,
+ the query <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful
+ connection; if it returns <literal>t</literal>, it means the server is in recovery mode.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection in which
+ the server is in recovery mode is preferred. If no such connections can be found,
+ then a connection in which the server is not in recovery mode will be considered.
+ </para>
+
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
</sect2>
@@ -2130,14 +2201,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>in_recovery</varname> was not reported by releases before 14.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
@@ -7237,6 +7310,16 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
linkend="libpq-connect-target-session-attrs"/> connection parameter.
</para>
</listitem>
+
+ <listitem>
+ <para>
+ <indexterm>
+ <primary><envar>PGTARGETSERVERTYPE</envar></primary>
+ </indexterm>
+ <envar>PGTARGETSERVERTYPE</envar> behaves the same as the <xref
+ linkend="libpq-connect-target-server-type"/> connection parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 20d1fe0..053a3ca 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>in_recovery</varname> was not reported by releases before 14.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index fd93bcf..49b8758 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7943,6 +7943,9 @@ StartupXLOG(void)
XLogCtl->SharedRecoveryState = RECOVERY_STATE_DONE;
SpinLockRelease(&XLogCtl->info_lck);
+ if (standbyState != STANDBY_DISABLED)
+ SendRecoveryExitSignal();
+
UpdateControlFile();
LWLockRelease(ControlFileLock);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 3c2b369..e6e63c6 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3082,6 +3082,34 @@ TerminateOtherDBBackends(Oid databaseId)
}
/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
* ProcArraySetReplicationSlotXmin
*
* Install limits to future computations of the xmin horizon to prevent vacuum
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 4fa385b..957df0d 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -585,6 +585,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
SetLatch(MyLatch);
latch_sigusr1_handler();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 9e0d5ec..f2790b6 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -140,6 +140,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index c9424f1..6e0481e 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -162,6 +162,14 @@ static bool UseSemiNewlineNewline = false; /* -j switch */
static bool RecoveryConflictPending = false;
static bool RecoveryConflictRetryable = true;
static ProcSignalReason RecoveryConflictReason;
+/*
+ * Inbound recovery exit interrupts are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
/* reused buffer to pass to SendRowDescriptionMessage() */
static MemoryContext row_description_context = NULL;
@@ -191,6 +199,7 @@ static void drop_unnamed_stmt(void);
static void log_disconnections(int code, Datum arg);
static void enable_statement_timeout(void);
static void disable_statement_timeout(void);
+static void ProcessRecoveryExitInterrupt(void);
/* ----------------------------------------------------------------
@@ -539,6 +548,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
@@ -3009,6 +3022,52 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
}
/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+ */
+static void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
+
+
+/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
* If an interrupt condition is pending, and it's safe to service it,
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index f4247ea..e8f9fd0 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -646,10 +646,13 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
/*
* The postmaster already started the XLOG machinery, but we need to
* call InitXLOGAccess(), if the system isn't in hot-standby mode.
- * This is handled by calling RecoveryInProgress and ignoring the
- * result.
+ * This is handled by calling RecoveryInProgress.
*/
- (void) RecoveryInProgress();
+ if (RecoveryInProgress())
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
}
else
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index 416a087..a4ebcef 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_num \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 75fc6f1..49c8fa8 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -610,6 +610,7 @@ static char *recovery_target_string;
static char *recovery_target_xid_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
@@ -1850,6 +1851,21 @@ static struct config_bool ConfigureNamesBool[] =
},
{
+ /*
+ * Not for general use --- used to indicate whether the instance is
+ * in recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
+ {
{"allow_system_table_mods", PGC_SUSET, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
NULL,
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index a5c7d0c..08b3b04 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -113,6 +113,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void TerminateOtherDBBackends(Oid databaseId);
extern void XidCacheRemoveRunningXids(TransactionId xid,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 5cb3969..6c243d4 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -43,6 +43,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index cfbe426..e5d42ac 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index bd30607..d29dd1f 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -68,6 +68,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
* handler */
+/* recovery exit interrupt handling function */
+extern void HandleRecoveryExitInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 27c9bb4..1d30ec8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -351,9 +351,14 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
+ {"target_server_type", "PGTARGETSERVERTYPE",
+ NULL, NULL,
+ "Target-Server-Type", "", 17, /* sizeof("prefer-secondary") = 17 */
+ offsetof(struct pg_conn, target_server_type)},
+
/* Terminating entry --- MUST BE LAST */
{NULL, NULL, NULL, NULL,
NULL, NULL, 0}
@@ -995,6 +1000,30 @@ parse_comma_separated_list(char **startptr, bool *more)
}
/*
+ * validateAndRecordTargetServerType
+ *
+ * Validate a given target server option value and record the requested server
+ * type. All valid target_server_type option values are also allowed in
+ * target_session_attrs (as a single option value).
+ *
+ * Returns true if OK, false if the specified option value is invalid.
+ */
+static bool
+validateAndRecordTargetServerType(const char *optionValue, TargetServerType *requestedServerType)
+{
+ if (strcmp(optionValue, "primary") == 0 || strcmp(optionValue, "read-write") == 0)
+ *requestedServerType = SERVER_TYPE_PRIMARY;
+ else if (strcmp(optionValue, "prefer-standby") == 0 || strcmp(optionValue, "prefer-secondary") == 0)
+ *requestedServerType = SERVER_TYPE_PREFER_STANDBY;
+ else if (strcmp(optionValue, "standby") == 0 || strcmp(optionValue, "secondary") == 0)
+ *requestedServerType = SERVER_TYPE_STANDBY;
+ else
+ return false;
+
+ return true;
+}
+
+/*
* connectOptions2
*
* Compute derived connection options after absorbing all user-supplied info.
@@ -1390,8 +1419,9 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ else if (!validateAndRecordTargetServerType(conn->target_session_attrs, &conn->requested_server_type))
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -1403,6 +1433,23 @@ connectOptions2(PGconn *conn)
}
/*
+ * Validate target_server_type option.
+ * If a target_server_type is specified, it overrides any target server
+ * type specified in target_session_attrs.
+ */
+ if (conn->target_server_type)
+ {
+ if (!validateAndRecordTargetServerType(conn->target_server_type, &conn->requested_server_type))
+ {
+ conn->status = CONNECTION_BAD;
+ printfPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("invalid target_server_type value: \"%s\"\n"),
+ conn->target_server_type);
+ return false;
+ }
+ }
+
+ /*
* Only if we get this far is it appropriate to try to connect. (We need a
* state flag, rather than just the boolean result of this function, in
* case someone tries to PQreset() the PGconn.)
@@ -2227,6 +2274,110 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type. The connection state is set to
+ * try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the read-write host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedReadOrWriteConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_rw_host == -1)
+ conn->which_rw_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (for recovery). The connection state
+ * is set to try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the read-write host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedRecoveryConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_rw_host == -1)
+ conn->which_rw_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -2309,6 +2460,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -2345,12 +2497,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->which_rw_host >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to read-only servers
+ * and should now try to re-connect to a previously-connected-to
+ * read-write server, whose host index is recorded in which_rw_host.
+ */
+ conn->whichhost = conn->which_rw_host;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->which_rw_host = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
+ else
conn->whichhost++;
/* Drop any address info for previous host */
@@ -3560,38 +3731,102 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a primary (not-in-recovery / read-write) connection is
+ * required, see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
* may just skip the test in that case.
*/
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ if ((conn->sversion >= 70400 &&
+ (conn->requested_server_type == SERVER_TYPE_PRIMARY ||
+ conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ if (conn->sversion < 140000)
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->in_recovery &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * which_rw_host == -2 occurs only for prefer-standby,
+ * on the second pass over the host list, after the
+ * first pass found no server in recovery. Accept this
+ * connection.
+ */
+ if (conn->which_rw_host == -2)
+ goto consume_checked_target_connection;
+
+ rejectCheckedRecoveryConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * If the requested type of connection is prefer-standby, record this
+ * host index and try the other specified hosts before reconsidering it.
+ * If the requested type of connection is standby, ignore this connection.
+ */
+
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
{
/*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * which_rw_host == -2 occurs only for prefer-standby,
+ * on the second pass over the host list, after the
+ * first pass found no server in recovery. Accept this
+ * connection.
*/
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
+ if (conn->which_rw_host == -2)
+ goto consume_checked_target_connection;
+ /* Close connection politely. */
conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY)
{
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ if (conn->which_rw_host == -1)
+ conn->which_rw_host = conn->whichhost;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3690,42 +3925,149 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * If the server is read-only and the requested type is
+ * primary, ignore it. If the server is read-write and the
+ * requested type is prefer-standby, record it the first time
+ * and consume it on the next scan (reached only when the
+ * first scan found no standby). If the server is read-write
+ * and the requested type is standby, ignore this connection.
+ */
+ if ((readonly_server &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
{
- /* Not writable; fail this connection. */
+ /*
+ * which_rw_host == -2 occurs only for prefer-standby, on
+ * the second pass over the host list, after the first
+ * pass found no read-only server. Accept this
+ * connection.
+ */
+ if (conn->which_rw_host == -2)
+ goto consume_checked_write_connection;
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SHOW transaction_read_only". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
+
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * If the server is in recovery and the requested type is
+ * primary, ignore it. If the server is not in recovery and
+ * the requested type is prefer-standby, record it the first
+ * time and consume it on the next scan (reached only when
+ * the first scan found no standby server).
+ */
+ if ((standby_server &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
/*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
+ * which_rw_host == -2 occurs only for prefer-standby,
+ * on the second pass over the host list, after the
+ * first pass found no server in recovery. Accept this
+ * connection.
*/
- conn->try_next_host = true;
+ if (conn->which_rw_host == -2)
+ goto consume_checked_recovery_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ rejectCheckedRecoveryConnection(conn);
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_recovery_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3738,7 +4080,7 @@ keep_going: /* We will come back to here until there is
}
/*
- * Something went wrong with "SHOW transaction_read_only". We
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
* should try next addresses.
*/
if (res)
@@ -3754,7 +4096,7 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
"on server \"%s:%s\"\n"),
displayed_host, displayed_port);
@@ -3766,7 +4108,6 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
-
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3910,6 +4251,9 @@ makeEmptyPGconn(void)
conn->try_gss = true;
#endif
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ conn->which_rw_host = -1;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index eea0237..be73aa7 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1058,7 +1058,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1112,6 +1112,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 3b6a9fb..9c59feb 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 1de91ae..8033fc0 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -317,6 +317,15 @@ typedef struct pg_conn_host
* found in password file. */
} pg_conn_host;
+/* Target server type to connect to */
+typedef enum
+{
+ SERVER_TYPE_ANY = 0, /* Any server (default) */
+ SERVER_TYPE_PRIMARY, /* Primary server */
+ SERVER_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SERVER_TYPE_STANDBY /* Standby server */
+} TargetServerType;
+
/*
* PGconn stores all the state data associated with a single connection
* to a backend.
@@ -370,9 +379,33 @@ struct pg_conn
char *ssl_min_protocol_version; /* minimum TLS protocol version */
char *ssl_max_protocol_version; /* maximum TLS protocol version */
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values:
+ * "any"
+ * "primary" (or "read-write")
+ * "prefer-standby" (or "prefer-secondary")
+ * "standby" (or "secondary").
+ */
char *target_session_attrs;
+ /*
+ * Type of server to connect to. Possible values:
+ * "primary" (or "read-write")
+ * "prefer-standby" (or "prefer-secondary")
+ * "standby" (or "secondary").
+ * This overrides any connection type specified by target_session_attrs.
+ * This option is almost a synonym for the target_session_attrs option, except
+ * its purpose is to closely reflect the similar PGJDBC targetServerType option.
+ * Note also that this option only accepts single option values, whereas in
+ * future, target_session_attrs may accept multiple session attribute values.
+ */
+ char *target_server_type;
+
+ /*
+ * The requested server type, derived from target_session_attrs / target_server_type.
+ */
+ TargetServerType requested_server_type;
+
/* Optional file to write trace info to */
FILE *Pfdebug;
@@ -406,6 +439,17 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * Index of the first read-write host encountered (if any) in the connection string.
+ *
+ * The initial value is -1, indicating that no read-write host has yet been found.
+ * It is then set to the index of the first read-write host, if one is found in the
+ * connection string during processing. If a second connection attempt is later made
+ * to that read-write host, which_rw_host is then set to -2 to avoid recursion during
+ * processing (and whichhost is set to the read-write host index).
+ */
+ int which_rw_host;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
@@ -436,6 +480,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool in_recovery; /* is the server in recovery mode? */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
@@ -540,7 +585,6 @@ struct pg_cancel
int be_key; /* key of backend --- needed for cancels */
};
-
/* String descriptions of the ExecStatusTypes.
* direct use of this array is deprecated; call PQresStatus() instead.
*/
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 778f11b..761d34d 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 62;
# Initialize master node
my $node_master = get_new_node('master');
@@ -121,6 +121,150 @@ test_target_session_attrs($node_master, $node_standby_1, $node_master, "any",
test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
"any", 0);
+# Connect to master in "primary" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_master,
+ "primary", 0);
+
+# Connect to master in "primary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to master in "prefer-secondary" mode with master,master list.
+test_target_session_attrs($node_master, $node_master, $node_master,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "secondary" mode with master,standby1 list.
+test_target_session_attrs($node_master, $node_standby_1, $node_standby_1,
+ "secondary", 0);
+
+# Connect to standby1 in "secondary" mode with standby1,master list.
+test_target_session_attrs($node_standby_1, $node_master, $node_standby_1,
+ "secondary", 0);
+
+# Tests for connection parameter target_server_type
+note "testing connection parameter \"target_server_type\"";
+
+# Routine designed to run tests on the connection parameter
+# target_server_type with multiple nodes.
+sub test_target_server_type
+{
+ my $node1 = shift;
+ my $node2 = shift;
+ my $target_node = shift;
+ my $mode = shift;
+ my $status = shift;
+
+ my $node1_host = $node1->host;
+ my $node1_port = $node1->port;
+ my $node1_name = $node1->name;
+ my $node2_host = $node2->host;
+ my $node2_port = $node2->port;
+ my $node2_name = $node2->name;
+
+ my $target_name = $target_node->name;
+
+ # Build connection string for connection attempt.
+ my $connstr = "host=$node1_host,$node2_host ";
+ $connstr .= "port=$node1_port,$node2_port ";
+ $connstr .= "target_server_type=$mode";
+
+ # The client used for the connection does not matter, only the backend
+ # point does.
+ my ($ret, $stdout, $stderr) =
+ $node1->psql('postgres', 'SHOW port;',
+ extra_params => [ '-d', $connstr ]);
+ is( $status == $ret && $stdout eq $target_node->port,
+ 1,
+ "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+
+ return;
+}
+
+# Connect to master in "read-write" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_master,
+ "read-write", 0);
+
+# Connect to master in "read-write" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_master,
+ "read-write", 0);
+
+# Connect to master in "primary" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_master,
+ "primary", 0);
+
+# Connect to master in "primary" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_master,
+ "primary", 0);
+
+# Connect to master in "prefer-standby" mode with master,master list.
+test_target_server_type($node_master, $node_master, $node_master,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to master in "prefer-secondary" mode with master,master list.
+test_target_server_type($node_master, $node_master, $node_master,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "standby" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "secondary" mode with master,standby1 list.
+test_target_server_type($node_master, $node_standby_1, $node_standby_1,
+ "secondary", 0);
+
+# Connect to standby1 in "secondary" mode with standby1,master list.
+test_target_server_type($node_standby_1, $node_master, $node_standby_1,
+ "secondary", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
1.8.3.1
On 6 Jul 2020, at 14:19, Greg Nancarrow <gregn4422@gmail.com> wrote:
This patch no longer applies, can you please submit a rebased version? I've
marked the entry as Waiting on Author in the meantime.
Here's a rebased version of the patch.
Thanks, but now the tests no longer work as the nodes in the test suite are
renamed. While simple enough for a committer to fix, it's always good to see
the tests pass in the CFBot to make sure the variable name error isn't hiding
an actual test error.
cheers ./daniel
Rebased patch attached, all tests currently working as of Jul 19
(a766d6ca22ac7c233e69c896ae0c5f19de916db4).
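For readers skimming the thread, the intended selection rules can be sketched in isolation. The following is only an illustration, not code from the patch: the helper names parse_target_server_type and choose_host are hypothetical, and it assumes every listed host is reachable and that its recovery state is already known, whereas libpq probes hosts one at a time while connecting. The enum and the accepted option strings are taken from the patch; the allow_any flag models the fact that "any" is valid for target_session_attrs but not for target_server_type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Same values as the TargetServerType enum the patch adds to libpq-int.h. */
typedef enum
{
    SERVER_TYPE_ANY = 0,        /* Any server (default) */
    SERVER_TYPE_PRIMARY,        /* Primary server */
    SERVER_TYPE_PREFER_STANDBY, /* Prefer standby server */
    SERVER_TYPE_STANDBY         /* Standby server */
} TargetServerType;

/*
 * Hypothetical helper: map an option string to a server type.
 * allow_any is true for target_session_attrs, false for target_server_type.
 * Returns false if the value is not recognised.
 */
static bool
parse_target_server_type(const char *value, bool allow_any,
                         TargetServerType *out)
{
    if (allow_any && strcmp(value, "any") == 0)
        *out = SERVER_TYPE_ANY;
    else if (strcmp(value, "primary") == 0 ||
             strcmp(value, "read-write") == 0)
        *out = SERVER_TYPE_PRIMARY;
    else if (strcmp(value, "prefer-standby") == 0 ||
             strcmp(value, "prefer-secondary") == 0)
        *out = SERVER_TYPE_PREFER_STANDBY;
    else if (strcmp(value, "standby") == 0 ||
             strcmp(value, "secondary") == 0)
        *out = SERVER_TYPE_STANDBY;
    else
        return false;
    return true;
}

/*
 * Hypothetical helper: pick a host index from a list, in the order given.
 * in_recovery[i] is true when host i reports that it is in recovery mode.
 * Returns -1 if no host is acceptable.
 */
static int
choose_host(TargetServerType type, const bool *in_recovery, int nhosts)
{
    int first_primary = -1; /* fallback used only by prefer-standby */

    assert(in_recovery != NULL || nhosts == 0);

    for (int i = 0; i < nhosts; i++)
    {
        switch (type)
        {
            case SERVER_TYPE_ANY:
                return i;
            case SERVER_TYPE_PRIMARY:
                if (!in_recovery[i])
                    return i;
                break;
            case SERVER_TYPE_STANDBY:
                if (in_recovery[i])
                    return i;
                break;
            case SERVER_TYPE_PREFER_STANDBY:
                if (in_recovery[i])
                    return i;
                if (first_primary < 0)
                    first_primary = i; /* remember it, keep looking */
                break;
        }
    }
    return first_primary;
}
```

The remembered first_primary index plays roughly the role that which_rw_host plays in the patch: with prefer-standby, a primary is only used once the whole host list has been scanned without finding a standby.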
Attachments:
v17-0001-Enhance-libpq-target_session_attrs-and-add-target_se.patch (application/octet-stream)
From 9f957fa1556725f7a86e0a4f61b52b3f1bbfa571 Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Sun, 19 Jul 2020 21:45:15 +1000
Subject: [PATCH v17] Enhance libpq target_session_attrs and add
target_server_type.
Enhance the connection parameter "target_session_attrs" to support values
primary/standby/prefer-standby (using the existing "read-write" as a synonym
for "primary"). To provide closer alignment with similar functionality in the
PGJDBC driver, add a new connection parameter "target_server_type".
Add "in_recovery" as a GUC_REPORT variable, to update clients when the server
is in recovery mode. This improves the speed of client connections to a standby
server, by avoiding the need to execute a command to determine if the server is
in recovery mode.
Add a new SIGUSR1 interrupt handler to report the exit from recovery mode
to all backends and their respective clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
contrib/postgres_fdw/expected/postgres_fdw.out | 2 +-
doc/src/sgml/high-availability.sgml | 5 +-
doc/src/sgml/libpq.sgml | 111 +++++-
doc/src/sgml/protocol.sgml | 8 +-
src/backend/access/transam/xlog.c | 3 +
src/backend/storage/ipc/procarray.c | 28 ++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 +
src/backend/tcop/postgres.c | 59 ++++
src/backend/utils/init/postinit.c | 9 +-
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 16 +
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/include/tcop/tcopprot.h | 2 +
src/interfaces/libpq/fe-connect.c | 445 ++++++++++++++++++++++---
src/interfaces/libpq/fe-exec.c | 6 +-
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 48 ++-
src/test/recovery/t/001_stream_rep.pl | 146 +++++++-
21 files changed, 830 insertions(+), 79 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 90db550..ba0ba6e 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -8898,7 +8898,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, target_server_type, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 6a9184f..e2b8cfa 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1886,8 +1886,9 @@ if (!triggered)
</para>
<para>
- During hot standby, the parameter <varname>transaction_read_only</varname> is always
- true and may not be changed. But as long as no attempt is made to modify
+ During hot standby, the parameters <varname>in_recovery</varname> and
+ <varname>transaction_read_only</varname> are always true and may not be
+ changed. But as long as no attempt is made to modify
the database, connections during hot standby will act much like any other
database connection. If failover or switchover occurs, the database will
switch to normal processing mode. Sessions will remain connected while the
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index f7b765f..26673c9 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1811,18 +1811,89 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- If this parameter is set to <literal>read-write</literal>, only a
- connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>primary</literal>, <literal>standby</literal> and
+ <literal>prefer-standby</literal>.
+ <literal>primary</literal> may alternatively be specified as <literal>read-write</literal>.
+ <literal>standby</literal> may alternatively be specified as <literal>secondary</literal>.
+ <literal>prefer-standby</literal> may alternatively be specified as
+ <literal>prefer-secondary</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts are specified in the
+ connection string, each host is tried in the order given until a connection
+ is successful.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then if the server is version 14
+ or greater, only a connection in which the server is not in recovery mode is considered
+ acceptable. The recovery mode state is determined by the value of the
+ <varname>in_recovery</varname> configuration parameter that is reported by the server upon
+ successful connection. Otherwise, if the server is prior to version 14, only a connection in
+ which read-write transactions are accepted by default is considered acceptable. To determine
+ whether the server supports read-write transactions, the query
+ <literal>SHOW transaction_read_only</literal> will be sent upon any successful connection; if
+ it returns <literal>on</literal>, it means the server doesn't support read-write transactions.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable. If the server is prior to version 14,
+ the query <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful
+ connection; if it returns <literal>t</literal>, it means the server is in recovery mode.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection in which
+ the server is in recovery mode is preferred. If no such connections can be found,
+ then a connection in which the server is not in recovery mode will be considered.
+ </para>
+
</listitem>
- </varlistentry>
+ </varlistentry>
+
+ <varlistentry id="libpq-connect-target-server-type" xreflabel="target_server_type">
+ <term><literal>target_server_type</literal></term>
+ <listitem>
+ <para>
+ The supported options for this parameter are <literal>primary</literal>,
+ <literal>standby</literal> and <literal>prefer-standby</literal>.
+ <literal>primary</literal> may alternatively be specified as <literal>read-write</literal>.
+ <literal>standby</literal> may alternatively be specified as <literal>secondary</literal>.
+ <literal>prefer-standby</literal> may alternatively be specified as
+ <literal>prefer-secondary</literal>.
+ This parameter overrides any connection type specified by <literal>target_session_attrs</literal>.
+ If multiple hosts are specified in the connection string, each host is tried in the order given
+ until a connection is successful.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then if the server is version 14
+ or greater, only a connection in which the server is not in recovery mode is considered
+ acceptable. The recovery mode state is determined by the value of the
+ <varname>in_recovery</varname> configuration parameter that is reported by the server upon
+ successful connection. Otherwise, if the server is prior to version 14, only a connection in
+ which read-write transactions are accepted by default is considered acceptable. To determine
+ whether the server supports read-write transactions, the query
+ <literal>SHOW transaction_read_only</literal> will be sent upon any successful connection; if
+ it returns <literal>on</literal>, it means the server doesn't support read-write transactions.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable. If the server is prior to version 14,
+ the query <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful
+ connection; if it returns <literal>t</literal>, it means the server is in recovery mode.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, a connection in which
+ the server is in recovery mode is preferred. If no such connections can be found,
+ then a connection in which the server is not in recovery mode will be considered.
+ </para>
+
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
</sect2>
@@ -2130,14 +2201,16 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>in_recovery</varname> was not reported by releases before 14.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
@@ -7237,6 +7310,16 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
linkend="libpq-connect-target-session-attrs"/> connection parameter.
</para>
</listitem>
+
+ <listitem>
+ <para>
+ <indexterm>
+ <primary><envar>PGTARGETSERVERTYPE</envar></primary>
+ </indexterm>
+ <envar>PGTARGETSERVERTYPE</envar> behaves the same as the <xref
+ linkend="libpq-connect-target-server-type"/> connection parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 8b00235..36ac534 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,16 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname> and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>in_recovery</varname> was not reported by releases before 14.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0a97b1d..f1c3e9d 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7943,6 +7943,9 @@ StartupXLOG(void)
XLogCtl->SharedRecoveryState = RECOVERY_STATE_DONE;
SpinLockRelease(&XLogCtl->info_lck);
+ if (standbyState != STANDBY_DISABLED)
+ SendRecoveryExitSignal();
+
UpdateControlFile();
LWLockRelease(ControlFileLock);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index b448533..3917e24 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3082,6 +3082,34 @@ TerminateOtherDBBackends(Oid databaseId)
}
/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
* ProcArraySetReplicationSlotXmin
*
* Install limits to future computations of the xmin horizon to prevent vacuum
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 4fa385b..957df0d 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -585,6 +585,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
SetLatch(MyLatch);
latch_sigusr1_handler();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index f522983..aa8841b 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -140,6 +140,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index c9424f1..6e0481e 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -162,6 +162,14 @@ static bool UseSemiNewlineNewline = false; /* -j switch */
static bool RecoveryConflictPending = false;
static bool RecoveryConflictRetryable = true;
static ProcSignalReason RecoveryConflictReason;
+/*
+ * Inbound recovery exit interrupts are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
/* reused buffer to pass to SendRowDescriptionMessage() */
static MemoryContext row_description_context = NULL;
@@ -191,6 +199,7 @@ static void drop_unnamed_stmt(void);
static void log_disconnections(int code, Datum arg);
static void enable_statement_timeout(void);
static void disable_statement_timeout(void);
+static void ProcessRecoveryExitInterrupt(void);
/* ----------------------------------------------------------------
@@ -539,6 +548,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
@@ -3009,6 +3022,52 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
}
/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited the recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+ */
+static void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
+
+
+/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
* If an interrupt condition is pending, and it's safe to service it,
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index f4247ea..e8f9fd0 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -646,10 +646,13 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
/*
* The postmaster already started the XLOG machinery, but we need to
* call InitXLOGAccess(), if the system isn't in hot-standby mode.
- * This is handled by calling RecoveryInProgress and ignoring the
- * result.
+ * This is handled by calling RecoveryInProgress.
*/
- (void) RecoveryInProgress();
+ if (RecoveryInProgress())
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
}
else
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index 416a087..a4ebcef 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_num \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 99a3e4f..6f65448 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -615,6 +615,7 @@ static char *recovery_target_string;
static char *recovery_target_xid_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
@@ -1855,6 +1856,21 @@ static struct config_bool ConfigureNamesBool[] =
},
{
+ /*
+ * Not for general use --- used to indicate whether the instance is
+ * in recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
+ {
{"allow_system_table_mods", PGC_SUSET, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
NULL,
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index a5c7d0c..08b3b04 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -113,6 +113,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void TerminateOtherDBBackends(Oid databaseId);
extern void XidCacheRemoveRunningXids(TransactionId xid,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 5cb3969..6c243d4 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -43,6 +43,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index cfbe426..e5d42ac 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index bd30607..d29dd1f 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -68,6 +68,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
* handler */
+/* recovery exit interrupt handling function */
+extern void HandleRecoveryExitInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 7bee9dd..0005cf3 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -351,9 +351,14 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
+ {"target_server_type", "PGTARGETSERVERTYPE",
+ NULL, NULL,
+ "Target-Server-Type", "", 17, /* sizeof("prefer-secondary") = 17 */
+ offsetof(struct pg_conn, target_server_type)},
+
/* Terminating entry --- MUST BE LAST */
{NULL, NULL, NULL, NULL,
NULL, NULL, 0}
@@ -1001,6 +1006,30 @@ parse_comma_separated_list(char **startptr, bool *more)
}
/*
+ * validateAndRecordTargetServerType
+ *
+ * Validate a given target server option value and record the requested server
+ * type. All valid target_server_type option values are also allowed in
+ * target_session_attrs (as a single option value).
+ *
+ * Returns true if OK, false if the specified option value is invalid.
+ */
+static bool
+validateAndRecordTargetServerType(const char *optionValue, TargetServerType *requestedServerType)
+{
+ if (strcmp(optionValue, "primary") == 0 || strcmp(optionValue, "read-write") == 0)
+ *requestedServerType = SERVER_TYPE_PRIMARY;
+ else if (strcmp(optionValue, "prefer-standby") == 0 || strcmp(optionValue, "prefer-secondary") == 0)
+ *requestedServerType = SERVER_TYPE_PREFER_STANDBY;
+ else if (strcmp(optionValue, "standby") == 0 || strcmp(optionValue, "secondary") == 0)
+ *requestedServerType = SERVER_TYPE_STANDBY;
+ else
+ return false;
+
+ return true;
+}
+
+/*
* connectOptions2
*
* Compute derived connection options after absorbing all user-supplied info.
@@ -1396,8 +1425,9 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ else if (!validateAndRecordTargetServerType(conn->target_session_attrs, &conn->requested_server_type))
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
@@ -1409,6 +1439,23 @@ connectOptions2(PGconn *conn)
}
/*
+ * Validate target_server_type option.
+ * If a target_server_type is specified, it overrides any target server
+ * type specified in target_session_attrs.
+ */
+ if (conn->target_server_type)
+ {
+ if (!validateAndRecordTargetServerType(conn->target_server_type, &conn->requested_server_type))
+ {
+ conn->status = CONNECTION_BAD;
+ printfPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("invalid target_server_type value: \"%s\"\n"),
+ conn->target_server_type);
+ return false;
+ }
+ }
+
+ /*
* Only if we get this far is it appropriate to try to connect. (We need a
* state flag, rather than just the boolean result of this function, in
* case someone tries to PQreset() the PGconn.)
@@ -2228,6 +2275,110 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type. The connection state is set to
+ * try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the read-write host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedReadOrWriteConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_rw_host == -1)
+ conn->which_rw_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (for recovery). The connection state
+ * is set to try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the read-write host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedRecoveryConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_rw_host == -1)
+ conn->which_rw_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -2310,6 +2461,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_RECOVERY:
break;
default:
@@ -2346,12 +2498,31 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->which_rw_host >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to read-only servers
+ * and should now try to re-connect to a previously-connected-to
+ * read-write server, whose host index is recorded in which_rw_host.
+ */
+ conn->whichhost = conn->which_rw_host;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->which_rw_host = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
+ else
conn->whichhost++;
/* Drop any address info for previous host */
@@ -3560,38 +3731,102 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
+ * If a primary (not-in-recovery / read-write) connection is
+ * required, see if we have one.
*
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
* may just skip the test in that case.
*/
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ if ((conn->sversion >= 70400 &&
+ (conn->requested_server_type == SERVER_TYPE_PRIMARY ||
+ conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ if (conn->sversion < 140000)
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->in_recovery &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby type for the next pass of the list
+ * of connections, as it couldn't find any servers that
+ * are in recovery.
+ */
+ if (conn->which_rw_host == -2)
+ goto consume_checked_target_connection;
+
+ rejectCheckedRecoveryConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * If the requested type of connection is prefer-standby, then record
+ * this host index and try other specified hosts before considering it later.
+ * If the requested type of connection is standby, ignore this connection.
+ */
+
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
{
/*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections, as it couldn't find any servers that
+ * are in recovery.
*/
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
+ if (conn->which_rw_host == -2)
+ goto consume_checked_target_connection;
+ /* Close connection politely. */
conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY)
{
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ if (conn->which_rw_host == -1)
+ conn->which_rw_host = conn->whichhost;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3690,42 +3925,149 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested server type is primary,
+ * ignore it. Server is read-write and requested type is
+ * prefer-standby, record it for the first time and try to
+ * consume in the next scan (it means no standby server
+ * was found in the first scan). Server is read-write and
+ * requested type is standby, ignore this connection.
+ */
+ if ((readonly_server &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
{
- /* Not writable; fail this connection. */
+ /*
+ * The following scenario is possible only for the
+ * prefer-standby type for the next pass of the list of
+ * connections, as it couldn't find any servers that
+ * are default read-only.
+ */
+ if (conn->which_rw_host == -2)
+ goto consume_checked_write_connection;
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SHOW transaction_read_only". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
+
+ case CONNECTION_CHECK_RECOVERY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_RECOVERY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in recovery mode and requested type is
+ * primary, ignore it. Server is not in recovery mode and
+ * requested type is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server was found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
/*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
+ * The following scenario is possible only for the
+ * prefer-standby type for the next pass of the list
+ * of connections, as it couldn't find any servers that
+ * are in recovery.
*/
- conn->try_next_host = true;
+ if (conn->which_rw_host == -2)
+ goto consume_checked_recovery_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ rejectCheckedRecoveryConnection(conn);
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_recovery_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3738,7 +4080,7 @@ keep_going: /* We will come back to here until there is
}
/*
- * Something went wrong with "SHOW transaction_read_only". We
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
* should try next addresses.
*/
if (res)
@@ -3754,7 +4096,7 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
"on server \"%s:%s\"\n"),
displayed_host, displayed_port);
@@ -3907,6 +4249,9 @@ makeEmptyPGconn(void)
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ conn->which_rw_host = -1;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index eea0237..be73aa7 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1058,7 +1058,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1112,6 +1112,10 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 3b6a9fb..9c59feb 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
} ConnStatusType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 1de91ae..8033fc0 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -317,6 +317,15 @@ typedef struct pg_conn_host
* found in password file. */
} pg_conn_host;
+/* Target server type to connect to */
+typedef enum
+{
+ SERVER_TYPE_ANY = 0, /* Any server (default) */
+ SERVER_TYPE_PRIMARY, /* Primary server */
+ SERVER_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SERVER_TYPE_STANDBY /* Standby server */
+} TargetServerType;
+
/*
* PGconn stores all the state data associated with a single connection
* to a backend.
@@ -370,9 +379,33 @@ struct pg_conn
char *ssl_min_protocol_version; /* minimum TLS protocol version */
char *ssl_max_protocol_version; /* maximum TLS protocol version */
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values:
+ * "any"
+ * "primary" (or "read-write")
+ * "prefer-standby" (or "prefer-secondary")
+ * "standby" (or "secondary").
+ */
char *target_session_attrs;
+ /*
+ * Type of server to connect to. Possible values:
+ * "primary" (or "read-write")
+ * "prefer-standby" (or "prefer-secondary")
+ * "standby" (or "secondary").
+ * This overrides any connection type specified by target_session_attrs.
+ * This option is almost a synonym for the target_session_attrs option, except
+ * its purpose is to closely reflect the similar PGJDBC targetServerType option.
+ * Note also that this option only accepts single option values, whereas in
+ * future, target_session_attrs may accept multiple session attribute values.
+ */
+ char *target_server_type;
+
+ /*
+ * The requested server type, derived from target_session_attrs / target_server_type.
+ */
+ TargetServerType requested_server_type;
+
/* Optional file to write trace info to */
FILE *Pfdebug;
@@ -406,6 +439,17 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * Index of the first read-write host encountered (if any) in the connection string.
+ *
+ * The initial value is -1, indicating that no read-write host has yet been found.
+ * It is then set to the index of the first read-write host, if one is found in the
+ * connection string during processing. If a second connection attempt is later made
+ * to that read-write host, which_rw_host is then set to -2 to avoid recursion during
+ * processing (and whichhost is set to the read-write host index).
+ */
+ int which_rw_host;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
@@ -436,6 +480,7 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool in_recovery; /* in_recovery */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
@@ -540,7 +585,6 @@ struct pg_cancel
int be_key; /* key of backend --- needed for cancels */
};
-
/* String descriptions of the ExecStatusTypes.
* direct use of this array is deprecated; call PQresStatus() instead.
*/
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 9e31a53..707d4ab 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 62;
# Initialize primary node
my $node_primary = get_new_node('primary');
@@ -121,6 +121,150 @@ test_target_session_attrs($node_primary, $node_standby_1, $node_primary, "any",
test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
"any", 0);
+# Connect to primary in "primary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "primary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "prefer-standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, $node_primary,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to primary in "prefer-secondary" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, $node_primary,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "secondary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "secondary", 0);
+
+# Connect to standby1 in "secondary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "secondary", 0);
+
+# Tests for connection parameter target_server_type
+note "testing connection parameter \"target_server_type\"";
+
+# Routine designed to run tests on the connection parameter
+# target_server_type with multiple nodes.
+sub test_target_server_type
+{
+ my $node1 = shift;
+ my $node2 = shift;
+ my $target_node = shift;
+ my $mode = shift;
+ my $status = shift;
+
+ my $node1_host = $node1->host;
+ my $node1_port = $node1->port;
+ my $node1_name = $node1->name;
+ my $node2_host = $node2->host;
+ my $node2_port = $node2->port;
+ my $node2_name = $node2->name;
+
+ my $target_name = $target_node->name;
+
+ # Build connection string for connection attempt.
+ my $connstr = "host=$node1_host,$node2_host ";
+ $connstr .= "port=$node1_port,$node2_port ";
+ $connstr .= "target_server_type=$mode";
+
+ # The client used for the connection does not matter, only the backend
+ # point does.
+ my ($ret, $stdout, $stderr) =
+ $node1->psql('postgres', 'SHOW port;',
+ extra_params => [ '-d', $connstr ]);
+ is( $status == $ret && $stdout eq $target_node->port,
+ 1,
+ "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+
+ return;
+}
+
+# Connect to primary in "read-write" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_primary,
+ "read-write", 0);
+
+# Connect to primary in "read-write" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_primary,
+ "read-write", 0);
+
+# Connect to primary in "primary" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "primary" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "prefer-standby" mode with primary,primary list.
+test_target_server_type($node_primary, $node_primary, $node_primary,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to primary in "prefer-secondary" mode with primary,primary list.
+test_target_server_type($node_primary, $node_primary, $node_primary,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "standby" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "secondary" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_standby_1,
+ "secondary", 0);
+
+# Connect to standby1 in "secondary" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_standby_1,
+ "secondary", 0);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
1.8.3.1
Hi Greg,
I have spent some time reading this discussion thread, and doing a code review of the latest (v17-0001) patch.
Below are my review comments; some are trivial, others not so much.
====
GENERAL COMMENT 1 ("any")
"any" should be included as a valid option for target_server_type.
IIUC target_server_type was added mostly to have better alignment with JDBC options.
Both Vladimir [1] and Dave [2] already said that JDBC does have an "any" option.
[1]: /messages/by-id/20200109152539.GA29017@alvherre.pgsql
[2]: /messages/by-id/20191227130828.GA21647@alvherre.pgsql
Furthermore, the fe-connect.c function makeEmptyPGconn sets the default:
+ conn->requested_server_type = SERVER_TYPE_ANY;
This means the default value of target_server_type is effectively "any".
Since this is the default, it should also be possible to assign the same value explicitly.
(Parts of the v17 patch affected by this are itemised below)
====
GENERAL COMMENT 2 (Removal of pg_is_in_recovery)
Around 22/3/2019 Hari added a lot of pg_is_in_recovery code in his patch 0006 [1]
[1]: /messages/by-id/20200109152539.GA29017@alvherre.pgsql
Much later, IIUC, the latest v17 patch took on board the recommendation from Alvaro and removed all that code:
"I would discard the whole thing about checking "SELECT pg_is_in_recovery()"" [2]
[2]: /messages/by-id/20191227130828.GA21647@alvherre.pgsql
However, it seems that not ALL parts of the original code got cleanly removed in v17.
There are a number of references to CONNECTION_CHECK_RECOVERY and pg_is_in_recovery still lurking.
(Parts of the v17 patch affected by this are itemised below)
====
COMMENT libpq.sgml (para blocks)
+ <para>
The v17 patch's help text for target_session_attrs and target_server_type currently uses a separate <para> block for each possible option value.
This format is inconsistent with the documentation style used for other variables in this SGML file.
Other places use sub-lists for option values, e.g. look at the "six modes" of sslmode.
====
COMMENT libpq.sgml (cut/paste parameter description)
I don't think the target_server_type help should be just a cut/paste duplicate of target_session_attrs. It is confusing because it leaves the reader doubting the purpose of such a duplication.
I suggest simplifying the target_server_type help as follows:
--
target_server_type
The purpose of this parameter is to reflect the similar PGJDBC targetServerType.
The supported options are the same as target_session_attrs.
This parameter overrides any connection type specified by target_session_attrs.
--
====
COMMENT libpq.sgml (pg_is_in_recovery)
(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable. If the server is prior to version 14,
+ the query <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful
+ connection; if it returns <literal>t</literal>, it means the server is in recovery mode.
+ </para>
Suggest change to:
--
If this parameter is set to <literal>standby</literal>, only a connection in which the server is in recovery mode is considered acceptable. The recovery mode state is determined by the value of the in_recovery configuration parameter that is reported by the server upon successful connection. Otherwise, if the server is prior to version 14, only a connection in which read-write transactions are not accepted by default is considered acceptable. To determine whether the server supports read-write transactions, the query SHOW transaction_read_only will be sent upon any successful connection; if it returns on, it means the server doesn't support read-write transactions.
--
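To make the suggested wording concrete, here is a minimal standalone sketch of the acceptance rule it describes. It is not code from the patch: the helper name server_matches_request and its parameters are hypothetical, the enum merely mirrors the one the patch adds to libpq-int.h, and the 140000 cutoff is taken from the version check in the v17 diff. It models only the first pass over the host list, so the prefer-standby fallback to a previously seen read-write host is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the TargetServerType enum the patch adds to libpq-int.h. */
typedef enum
{
    SERVER_TYPE_ANY = 0,
    SERVER_TYPE_PRIMARY,
    SERVER_TYPE_PREFER_STANDBY,
    SERVER_TYPE_STANDBY
} TargetServerType;

/*
 * Hypothetical helper (not part of libpq): does a freshly connected server
 * satisfy the requested type on the first pass over the host list?
 *
 * sversion              numeric server version, e.g. 140000
 * in_recovery           value reported via the in_recovery parameter (v14+)
 * transaction_read_only result of SHOW transaction_read_only (older servers)
 */
static bool
server_matches_request(int sversion, bool in_recovery,
                       bool transaction_read_only,
                       TargetServerType requested)
{
    /* v14+ reports in_recovery; older servers are judged by read-only state. */
    bool looks_like_standby =
        (sversion >= 140000) ? in_recovery : transaction_read_only;

    switch (requested)
    {
        case SERVER_TYPE_ANY:
            return true;
        case SERVER_TYPE_PRIMARY:
            return !looks_like_standby;
        case SERVER_TYPE_PREFER_STANDBY:
        case SERVER_TYPE_STANDBY:
            return looks_like_standby;
    }
    return false;
}

/* Example usage: a few representative combinations. */
static void
example_usage(void)
{
    /* v14 standby, "standby" requested: accepted via in_recovery. */
    assert(server_matches_request(140000, true, false, SERVER_TYPE_STANDBY));
    /* v13 read-write server, "standby" requested: rejected via SHOW result. */
    assert(!server_matches_request(130000, false, false, SERVER_TYPE_STANDBY));
    /* v13 read-only server, "primary" requested: rejected. */
    assert(!server_matches_request(130000, false, true, SERVER_TYPE_PRIMARY));
}
```

Keeping the version switch in one place like this is only an illustration of the documented behaviour; the patch itself spreads the same decision across the CONNECTION_CHECK_TARGET and CONNECTION_CHECK_WRITABLE states.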
====
COMMENT libpq.sgml (Oxford comma)
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname> and
+ <varname>in_recovery</varname>.
Previously there was an Oxford comma (e.g. before the "and"). Now there isn't.
The v17 patch should not alter the previous listing style.
====
COMMENT protocol.sgml (Oxford comma)
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname> and
+ <varname>in_recovery</varname>.
Previously there was an Oxford comma (e.g. before the "and"). Now there isn't.
The v17 patch should not alter the previous listing style.
====
QUESTION standby.c - SendRecoveryExitSignal
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
I wonder if this function is really necessary?
IIUC the SendRecoveryExitSignal is only called from one place (xlog.c).
Why not just call SendSignalToAllBackends directly from there and remove this extra layer?
====
COMMENT postgres.c (signal comment)
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
Suggest change comment to say:
/* flag that work needs to be done */
====
COMMENT fe-connect.c (sizeof)
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
According to the SGML "prefer-secondary" is also an acceptable value for target_session_attrs, so the display field width should be 17 /* sizeof("prefer-secondary") */ not 15.
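The widths quoted in this comment can be checked mechanically, because sizeof applied to a string literal counts the terminating NUL byte. A minimal sketch; the constant names are hypothetical and are not part of libpq:

```c
#include <stddef.h>

/*
 * sizeof on a string literal includes the trailing '\0', so each value
 * is the display width needed to hold that option value as a C string.
 * The names below are illustrative only.
 */
static const size_t width_read_write = sizeof("read-write");
static const size_t width_prefer_standby = sizeof("prefer-standby");
static const size_t width_prefer_secondary = sizeof("prefer-secondary");
```

Under these assumptions the three widths come out as 11, 15 and 17, matching the numbers in the patch hunk and in the suggested correction.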
====
COMMENT fe-connect.c (CONNECTION_CHECK_RECOVERY)
@@ -2310,6 +2461,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_RECOVERY:
break;
(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
Probably this CONNECTION_CHECK_RECOVERY case should be removed.
====
COMMENT fe-connect.c - function validateAndRecordTargetServerType
As noted in GENERAL COMMENT 1, I suggest "any" needs to be included in this function as a valid option.
====
COMMENT fe-connect.c (target_session_attrs validation)
@@ -1396,8 +1425,9 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ else if (!validateAndRecordTargetServerType(conn->target_session_attrs, &conn->requested_server_type))
I suggest introducing a 2nd function for target_session_attrs (e.g. validateAndRecordTargetSessionAttrs).
Even though these parameters are functionally the same today, in future they may not be [1].
[1]: /messages/by-id/20200109152539.GA29017@alvherre.pgsql
Regardless, the special "any" handling can be removed from here because (from GENERAL COMMENT 1) the validateAndRecordTargetServerType should now accept "any".
====
COMMENT fe-connect.c (message typo)
Found an existing typo, unrelated to the v17 patch.
"target_settion_attrs", --> "target_session_attrs",
====
COMMENT fe-connect.c (libpq_gettext)
+ printfPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("invalid target_server_type value: \"%s\"\n"),
+ conn->target_server_type);
The parameter name "target_server_type" should be separated from the format string as "%s", the same as is done by the libpq_gettext of the preceding code.
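To illustrate the point, here is a minimal sketch of a message formatter that takes the option name as an argument, so that one translatable format string can serve every connection option. It uses plain snprintf instead of printfPQExpBuffer and libpq_gettext, and the function name is hypothetical:

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the libpq error path: the option name is
 * passed as an argument rather than baked into the format string, so
 * translators only see a single "invalid %s value" message.
 * Returns the snprintf result (length of the full message).
 */
static int
sketch_format_invalid_option(char *buf, size_t buflen,
							 const char *optname, const char *value)
{
	return snprintf(buf, buflen, "invalid %s value: \"%s\"\n",
					optname, value);
}
```

A caller would pass "target_server_type" and conn->target_server_type as the two trailing arguments, mirroring how the preceding code is said to report an invalid target_session_attrs value.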
====
COMMENT fe-connect.c (indentation)
+ goto error_return;
+ }
}
+ else
conn->whichhost++;
Bad indentation of the else's statement.
====
COMMENT fe-connect.c (if/else complexity)
+ else if ((conn->in_recovery &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
TBH I was unable to read this code without first drawing up a matrix of combinations to deduce what was going on.
It should not be so inscrutable.
Suggestion1:
Consider putting a large comment at the top of this CONNECTION_CHECK_TARGET to give an overview of what this code is trying to achieve.
e.g. something like this:
---
Mode |in_recovery |version < 7.4 |version < 14 |version >= 14
---------------+------------+-------------------+---------------------------+-------------
ANY |NA |OK |OK |OK
PRIMARY |true |OK |SHOW transaction_read_only |keep_going
PRIMARY |false |OK |SHOW transaction_read_only |OK
PREFER_STANDBY |true |keep_going (or -2) |SHOW transaction_read_only |OK
PREFER_STANDBY |false |keep_going (or -2) |SHOW transaction_read_only |keep_going (or -2)
STANDBY |true |keep_going |SHOW transaction_read_only |OK
STANDBY |false |keep_going |SHOW transaction_read_only |keep_going
---
Suggestion2:
Consider separating out the requested_server_type cases instead of trying to handle everything in the same else block. The code may be a bit longer, but by aligning it more closely with the SGML documentation it can be made easier to understand.
e.g. something like this:
---
if (conn->requested_server_type == SERVER_TYPE_PRIMARY) {
/* A server in recovery is not a primary: reject it, otherwise OK. */
if (conn->in_recovery) {
rejectCheckedRecoveryConnection(conn);
goto keep_going;
}
goto consume_checked_target_connection;
}
if (conn->requested_server_type == SERVER_TYPE_STANDBY) {
/* Only a connection in recovery mode is acceptable. */
if (!conn->in_recovery) {
rejectCheckedRecoveryConnection(conn);
goto keep_going;
}
goto consume_checked_target_connection;
}
if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY) {
/* A connection in recovery mode is preferred. */
if (conn->in_recovery)
goto consume_checked_target_connection;
/*
* The following scenario is possible only for the
* prefer-standby type for the next pass of the list
* of connections, as it couldn't find any servers that
* are in recovery.
*/
if (conn->which_rw_host == -2)
goto consume_checked_target_connection;
/* reject function below remembers this r/w host index in case it is needed later */
rejectCheckedRecoveryConnection(conn);
goto keep_going;
}
---
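The "version >= 14" column of the matrix, together with the branches sketched above, reduces to a small pure function. The following is only an illustration under that assumption: the enum and function names are hypothetical, the older-server columns (SHOW transaction_read_only) are not modelled, and second_pass stands in for the which_rw_host == -2 case.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch; the patch has its own SERVER_TYPE_* enum. */
typedef enum
{
	SKETCH_ANY,
	SKETCH_PRIMARY,
	SKETCH_PREFER_STANDBY,
	SKETCH_STANDBY
} SketchServerType;

typedef enum
{
	SKETCH_ACCEPT,				/* "OK" in the matrix */
	SKETCH_KEEP_GOING			/* reject this host and try the next one */
} SketchDecision;

/*
 * Decide what to do with a freshly established connection, for the
 * "version >= 14" column only.  second_pass models the "(or -2)" case:
 * prefer-standby found no standby on the first pass over the host list
 * and now accepts a server that is not in recovery.
 */
static SketchDecision
sketch_check_target(SketchServerType requested, bool in_recovery,
					bool second_pass)
{
	switch (requested)
	{
		case SKETCH_ANY:
			return SKETCH_ACCEPT;
		case SKETCH_PRIMARY:
			return in_recovery ? SKETCH_KEEP_GOING : SKETCH_ACCEPT;
		case SKETCH_STANDBY:
			return in_recovery ? SKETCH_ACCEPT : SKETCH_KEEP_GOING;
		case SKETCH_PREFER_STANDBY:
			if (in_recovery || second_pass)
				return SKETCH_ACCEPT;
			return SKETCH_KEEP_GOING;
	}
	return SKETCH_KEEP_GOING;	/* not reached for valid enum values */
}
```

Keeping the decision separate from the connection state machine would also make each row of the matrix directly testable.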
====
COMMENT fe-connect.c (case CONNECTION_CHECK_RECOVERY)
(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
v17 patch has removed the previous call to pg_is_in_recovery.
IIUC this means that there is currently no way for the remaining CONNECTION_CHECK_RECOVERY case to even be executed.
If I am correct, then a significant slab of code (~100 lines) can be deleted.
See case CONNECTION_CHECK_RECOVERY (lines ~ 4007 thru 4110)
====
COMMENT fe-connect.c - function freePGConn (missing free?)
There is code to free(conn->target_session_attrs), but there is no code to free target_server_type.
Appears to be accidental omission.
====
COMMENT fe-exec.c (altered comment)
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, and
A comma was added.
Suggest avoid altering comments not directly related to the v17 patch logic.
====
COMMENT libpq-fe.h (CONNECTION_CHECK_RECOVERY)
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
Probably this CONNECTION_CHECK_RECOVERY case should be removed.
====
Kind Regards,
Peter Smith
---
Fujitsu Australia
Hi Greg,
I was able to successfully execute all the tests of the v17-0001 patch.
But I do have a couple of additional review comments about the test code.
====
COMMENT - missing "any" tests
In my earlier code review (previous email) I suggested that "any" should be added as valid option to the target_server_type parameter.
But this now means there are some missing test cases for
target_server_type = "any"
====
COMMENT - negative tests?
IIUC when "standby" (aka "secondary") is specified, and there is no in_recovery server available, then the result should be an error like "could not make a readonly connection to server "
But I did not find any such error combination tests.
e.g. Where are these test cases?
target_session_attrs = "standby", when no standby is available
target_session_attrs = "secondary", when no standby is available
target_server_type = "standby", when no standby is available
target_server_type = "secondary", when no standby is available
--
And, similarly for "could not make a writable connection to server ".
e.g. Where are these test cases?
target_session_attrs = "primary", when no primary is available
target_session_attrs = "read-write", when no primary is available
target_server_type = "primary", when no primary is available
target_server_type = "read-write", when no primary is available
Kind Regards,
Peter Smith
---
Fujitsu Australia
On Fri, Dec 27, 2019 at 8:08 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
So, we can know whether server is primary/standby by checking
in_recovery, as opposed to knowing whether read-write which is done by
checking transaction_read_only. So we can keep read-write as a synonym
for "primary", and check in_recovery when used in servers that support
the new GUC, and check transaction_read_only in older servers.
I think it would be better to have read-write and read-only check
transaction_read_only, and primary and standby can check the new
thing. There can never be any real advantage in having synonyms for
the same thing, but there can be an advantage to letting users choose
the behavior they want.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From: Robert Haas <robertmhaas@gmail.com>
I think it would be better to have read-write and read-only check
transaction_read_only, and primary and standby can check the new
thing. There can never be any real advantage in having synonyms for
the same thing, but there can be an advantage to letting users choose
the behavior they want.
+1
"primary" is not always equal to "read-write". When normal users are only allowed to query data on a logically replicated database (ALTER USER SET default_transaction_read_only = on), it's the primary read-only server.
Regards
Takayuki Tsunakawa
Hi Peter,
I have updated the patch (attached) based on your comments, with
adjustments made for additional changes based on feedback (which I
tend to agree with) from Robert Haas and Tsunakawa san, who suggested
read-write/read-only should be functionally different to
primary/standby, and not just have "read-write" a synonym for
"primary".
I also thought it appropriate to remove "read-write", "standby" and
"prefer-standby" from accepted values for "target_server_type"
(instead just support "secondary" and "prefer-secondary") to match the
similar targetServerType PGJDBC option.
So currently have as supported option values:
target_session_attrs:
any/read-write/read-only/primary/standby(/secondary)/prefer-standby(/prefer-secondary)
target_server_type: any/primary/secondary/prefer-secondary
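As an illustration of how the two option sets above could be validated separately, here is a minimal table-driven sketch. All type and function names are hypothetical and are not taken from the v18 patch; only the accepted strings come from the lists above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical internal representation of the requested target. */
typedef enum
{
	SKETCH_TARGET_ANY,
	SKETCH_TARGET_READ_WRITE,
	SKETCH_TARGET_READ_ONLY,
	SKETCH_TARGET_PRIMARY,
	SKETCH_TARGET_STANDBY,
	SKETCH_TARGET_PREFER_STANDBY
} SketchTarget;

typedef struct
{
	const char *name;
	SketchTarget target;
} SketchOption;

/* target_session_attrs: includes the PGJDBC-style synonyms. */
static const SketchOption sketch_session_attrs[] = {
	{"any", SKETCH_TARGET_ANY},
	{"read-write", SKETCH_TARGET_READ_WRITE},
	{"read-only", SKETCH_TARGET_READ_ONLY},
	{"primary", SKETCH_TARGET_PRIMARY},
	{"standby", SKETCH_TARGET_STANDBY},
	{"secondary", SKETCH_TARGET_STANDBY},
	{"prefer-standby", SKETCH_TARGET_PREFER_STANDBY},
	{"prefer-secondary", SKETCH_TARGET_PREFER_STANDBY}
};

/* target_server_type: only the PGJDBC-compatible spellings. */
static const SketchOption sketch_server_type[] = {
	{"any", SKETCH_TARGET_ANY},
	{"primary", SKETCH_TARGET_PRIMARY},
	{"secondary", SKETCH_TARGET_STANDBY},
	{"prefer-secondary", SKETCH_TARGET_PREFER_STANDBY}
};

/* Look up value in a table; on success store the target and return true. */
static bool
sketch_lookup(const SketchOption *opts, size_t nopts,
			  const char *value, SketchTarget *out)
{
	for (size_t i = 0; i < nopts; i++)
	{
		if (strcmp(opts[i].name, value) == 0)
		{
			*out = opts[i].target;
			return true;
		}
	}
	return false;
}

static bool
sketch_validate_session_attrs(const char *value, SketchTarget *out)
{
	return sketch_lookup(sketch_session_attrs,
						 sizeof(sketch_session_attrs) / sizeof(sketch_session_attrs[0]),
						 value, out);
}

static bool
sketch_validate_server_type(const char *value, SketchTarget *out)
{
	return sketch_lookup(sketch_server_type,
						 sizeof(sketch_server_type) / sizeof(sketch_server_type[0]),
						 value, out);
}
```

Two tables keep the parameters independent, so target_session_attrs can later gain values that are not server types without touching target_server_type.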
See my responses to your review comments below:
GENERAL COMMENT 1 ("any")
"any" should be included as valid option for target_server_type.
IIUC target_server_type was added mostly to have better alignment with JDBC options.
Both Vladimir [1] and Dave [2] already said that JDBC does have an "any" option.
[1] - /messages/by-id/CAB=Je-FwOVE=8gR1UDDZRnWZR65fRG40e8zW_U_6mnUqbce68g@mail.gmail.com
[2] - /messages/by-id/CADK3HHJ9316ji7L-97cJBY=wp4E3ddPMn8XdkNz6j8d9u0OhmQ@mail.gmail.com
Furthermore, the fe-connect.c function makeEmptyPGConn sets default:
+ conn->requested_server_type = SERVER_TYPE_ANY;
This means the default type of target_server_type is "any".
Since this is the default, it should also be possible to assign the same value explicitly.
(Parts of the v17 patch affected by this are itemised below)
GN RESPONSE: After checking the PGJDBC source and previous comments, I agree.
Have updated the patch to allow "any" for target_server_type.
====
GENERAL COMMENT 2 (Removal of pg_is_in_recovery)
Around 22/3/2019 Hari added a lot of pg_is_in_recovery code in his patch 0006 [1]
[1] - /messages/by-id/CAJrrPGd4YeA+N=xC+1XPVoGzMCATJZY4irVQEJ6i0aPqorUi7g@mail.gmail.com
Much later IIUC the latest v17 patch has taken onboard the recommendation from Alvaro and removed all that code:
"I would discard the whole thing about checking "SELECT pg_is_in_recovery()"" [2]
[2] - /messages/by-id/20191227130828.GA21647@alvherre.pgsql
However, it seems that not ALL parts of the original code got cleanly removed in v17.
There are a number of references to CONNECTION_CHECK_RECOVERY and pg_is_in_recovery still lurking.
(Parts of the v17 patch affected by this are itemised below)
GN RESPONSE: Agree. The calling code was removed but somehow the
CONNECTION_CHECK_RECOVERY case block (and enum) was not removed. Also,
part of the documentation was not updated, for the case where the
server version is prior to 14.
I have updated the patch to correct this.
====
COMMENT libpq.sgml (para blocks)
+ <para>
The v17 patch for target_session_attrs and target_server_type help is currently using <para> blocks for each of the possible option values.
This format is inconsistent with the document style used for other variables in this SGML.
Other places are using sub-lists for option values. e.g. look at "six modes" of sslmode.
GN RESPONSE: True, but this was the case BEFORE the patch, and these
options are more complex than ones where sub-lists for option values
are used - there needs to be common explanation of what the option
synonyms are, and how the behaviour is version dependent, so it
doesn't really lend itself to simple list items, that would need to
cross-reference other list items.
====
COMMENT libpq.sgml (cut/paste parameter description)
I don't think that target_server_type help should be just a cut/paste duplicate of target_session_attrs. It is confusing because it leaves the reader doubting the purpose of having such a duplication.
Suggest simplifying the target_server_type help as follows:
--
target_server_type
The purpose of this parameter is to reflect the similar PGJDBC targetServerType.
The supported options are same as target_session_attrs.
This parameter overrides any connection type specified by target_session_attrs.
--
GN RESPONSE: Agree. Will update documentation, though with some
modifications to the wording because of changes in supported option
values already mentioned, and target_session_attrs could contain
non-server-type options in the future.
====
COMMENT libpq.sgml (pg_is_in_recovery)
(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
+ <para>
+ If this parameter is set to <literal>standby</literal>, only a connection in which
+ the server is in recovery mode is considered acceptable. If the server is prior to version 14,
+ the query <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful
+ connection; if it returns <literal>t</literal>, it means the server is in recovery mode.
+ </para>
Suggest change to:
--
If this parameter is set to <literal>standby</literal>, only a connection in which the server is in recovery mode is considered acceptable. The recovery mode state is determined by the value of the in_recovery configuration parameter that is reported by the server upon successful connection. Otherwise, if the server is prior to version 14, only a connection in which read-write transactions are not accepted by default is considered acceptable. To determine whether the server supports read-write transactions, the query SHOW transaction_read_only will be sent upon any successful connection; if it returns on, it means the server doesn't support read-write transactions.
--
GN RESPONSE: I've removed the residual references to
pg_is_in_recovery, and updated the documentation in a similar way.
====
COMMENT libpq.sgml (Oxford comma)
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname> and
+ <varname>in_recovery</varname>.
Previously there was an Oxford comma (e.g. before the "and"). Now there isn't.
The v17 patch should not alter the previous listing style.
GN RESPONSE: I have restored the Oxford comma to its former glory.
====
COMMENT protocol.sgml (Oxford comma)
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname> and
+ <varname>in_recovery</varname>.
Previously there was an Oxford comma (e.g. before the "and"). Now there isn't.
The v17 patch should not alter the previous listing style.
GN RESPONSE: I have restored the Oxford comma to its former glory.
====
QUESTION standby.c - SendRecoveryExitSignal
I wonder if this function is really necessary?
IIUC the SendRecoveryExitSignal is only called from one place (xlog.c).
Why not just call SendSignalToAllBackends directly from there and remove this extra layer?
GN RESPONSE: It's not much of a layer. It could be argued that having
a common function for this makes sense, in case additional code needs
to be added (so it's then not repeated/missed in places).
====
COMMENT postgres.c (signal comment)
+ /* signal that work needs to be done */
+ recoveryExitInterruptPending = true;
Suggest change comment to say:
/* flag that work needs to be done */
GN RESPONSE: Agree, have updated the patch.
====
COMMENT fe-connect.c (sizeof)
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
According to the SGML "prefer-secondary" is also an acceptable value for target_session_attrs, so the display field width should be 17 /* sizeof("prefer-secondary") */ not 15.
GN RESPONSE: I'm not sure about this, it's debatable. The intention of
these settings is to provide information for a "generic database
connection dialog". For "Target-Session-Attrs" I'd probably expect the
dialog to list the option "prefer-standby" rather than the
(PGJDBC-compatible) synonym "prefer-secondary" (whose length of 17 was
used in the case of "Target-Server-Type").
====
COMMENT fe-connect.c (CONNECTION_CHECK_RECOVERY)
@@ -2310,6 +2461,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_RECOVERY:
break;
(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
Probably this CONNECTION_CHECK_RECOVERY case should be removed.
GN RESPONSE: Agree, removed because it is no longer used.
====
COMMENT fe-connect.c - function validateAndRecordTargetServerType
As noted in GENERAL COMMENT 1, I suggest "any" needs to be included in this function as a valid option.
GN RESPONSE: Agree, updated patch.
====
COMMENT fe-connect.c (target_session_attrs validation)
@@ -1396,8 +1425,9 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (strcmp(conn->target_session_attrs, "any") == 0)
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ else if (!validateAndRecordTargetServerType(conn->target_session_attrs, &conn->requested_server_type))
I suggest introducing a 2nd function for target_session_attrs (e.g. validateAndRecordTargetSessionAttrs).
Even though these parameters are functionally the same today, in future they may not be [1].
[1] - /messages/by-id/20200109152539.GA29017@alvherre.pgsql
Regardless, the special "any" handling can be removed from here because (from GENERAL COMMENT 1) the validateAndRecordTargetServerType should now accept "any".
GN RESPONSE: Agree, have added separate validation functions.
====
COMMENT fe-connect.c (message typo)
Found an existing typo, unrelated to the v17 patch.
"target_settion_attrs", --> "target_session_attrs",
GN RESPONSE: Have updated the patch to correct that.
====
COMMENT fe-connect.c (libpq_gettext)
+ printfPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("invalid target_server_type value: \"%s\"\n"),
+ conn->target_server_type);
The parameter name "target_server_type" should be separated from the format string as "%s", the same as is done by the libpq_gettext of the preceding code.
GN RESPONSE: Agree, was not correct in the v17 patch, have updated the patch.
====
COMMENT fe-connect.c (indentation)
+ goto error_return;
+ }
}
+ else
conn->whichhost++;
Bad indentation of the else's statement.
GN RESPONSE: Updated the patch to fix that.
====
COMMENT fe-connect.c (if/else complexity)
+ else if ((conn->in_recovery &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
TBH I was unable to read this code without first drawing up a matrix of combinations to deduce what was going on.
It should not be so inscrutable.
Suggestion1:
Consider putting a large comment at the top of this CONNECTION_CHECK_TARGET to give an overview of what this code is trying to achieve.
...
Suggestion2:
Consider separating out the requested_server_type cases instead of trying to handle everything in the same else block. The code may be a bit longer, but by aligning it more closely with the SGML documentation it can be made easier to understand.
GN RESPONSE: Some slight restructuring has been made and comments
updated to express the logic in words, to assist in understanding.
(It's actually not that bad, but maybe I've been looking at this for too long).
====
COMMENT fe-connect.c (case CONNECTION_CHECK_RECOVERY)
(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
v17 patch has removed the previous call to pg_is_in_recovery.
IIUC this means that there is currently no way for the remaining CONNECTION_CHECK_RECOVERY case to even be executed.
If I am correct, then a significant slab of code (~100 lines) can be deleted.
See case CONNECTION_CHECK_RECOVERY (lines ~ 4007 thru 4110)
GN RESPONSE: Agree, have removed code that is no longer called.
====
COMMENT fe-connect.c - function freePGConn (missing free?)
There is code to free(conn->target_session_attrs), but there is no code to free target_server_type.
Appears to be accidental omission.
GN RESPONSE: Have added missing free(), oops.
====
COMMENT fe-exec.c (altered comment)
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, and
A comma was added.
Suggest avoid altering comments not directly related to the v17 patch logic.
GN RESPONSE: Have removed the (Oxford!) comma, accidentally added.
====
COMMENT libpq-fe.h (CONNECTION_CHECK_RECOVERY)
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_RECOVERY /* Check whether server is in recovery */
(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)
Probably this CONNECTION_CHECK_RECOVERY case should be removed.
GN RESPONSE: Have removed, no longer used.
But I do have a couple of additional review comments about the test code.
====
COMMENT - missing "any" tests
In my earlier code review (previous email) I suggested that "any" should be added as valid option to the target_server_type parameter.
But this now means there are some missing test cases for
target_server_type = "any"
GN RESPONSE: Have added "any" tests for target_server_type.
====
COMMENT - negative tests?
IIUC when "standby" (aka "secondary") is specified, and there is no in_recovery server available, then the result should be an error like "could not make a readonly connection to server "
But I did not find any such error combination tests.
e.g. Where are these test cases?
target_session_attrs = "standby", when no standby is available
target_session_attrs = "secondary", when no standby is available
target_server_type = "standby", when no standby is available
target_server_type = "secondary", when no standby is available
--
And, similarly for "could not make a writable connection to server ".
e.g. Where are these test cases?
target_session_attrs = "primary", when no primary is available
target_session_attrs = "read-write", when no primary is available
target_server_type = "primary", when no primary is available
target_server_type = "read-write", when no primary is available
GN RESPONSE: No such negative tests existed for target_session_attrs
prior to this patch.
I have added some negative tests for both target_session_attrs and
target_server_type.
Note that in the v18 patch, "standby" and "read-write" are no longer
allowed for "target_server_type" (since not PGJDBC driver compatible).
Also, "read-write" is no longer considered a synonym for "primary" -
"read-write" means writeable (non read-only) and "primary" means not
in recovery.
Tests were adjusted accordingly.
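The distinction between "primary" and "read-write" can be pictured with a small model of the two values a version 14 or later server would report at connection time. This is only a sketch: the struct and function names are hypothetical, and the pre-14 fallback to SHOW transaction_read_only is not modelled.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the two reported parameters. */
typedef struct
{
	bool		in_recovery;			/* server is a standby */
	bool		transaction_read_only;	/* session defaults to read-only */
} SketchServerState;

/* "primary" looks only at recovery state. */
static bool
sketch_matches_primary(const SketchServerState *s)
{
	return !s->in_recovery;
}

/* "standby" is the complement of "primary". */
static bool
sketch_matches_standby(const SketchServerState *s)
{
	return s->in_recovery;
}

/* "read-write" looks only at whether writes are accepted by default. */
static bool
sketch_matches_read_write(const SketchServerState *s)
{
	return !s->transaction_read_only;
}

/* "read-only" is the complement of "read-write". */
static bool
sketch_matches_read_only(const SketchServerState *s)
{
	return s->transaction_read_only;
}
```

A primary whose users have default_transaction_read_only = on (the case raised earlier in the thread) has in_recovery false and transaction_read_only true, so in this model it matches "primary" and "read-only" but not "read-write".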
Regards,
Greg Nancarrow
Fujitsu Australia
Attachments:
v18-0001-Enhance-libpq-target_session_attrs-and-add-target_se.patch (application/octet-stream)
From c9da78a38611d1b3ac5d259d8c1d4ca01ac01658 Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Tue, 18 Aug 2020 20:48:43 +1000
Subject: [PATCH v18] Enhance libpq target_session_attrs and add
target_server_type.
Enhance the connection parameter "target_session_attrs" to support new values
read-only/primary/standby/prefer-standby. To provide closer alignment with
similar functionality in the PGJDBC driver, add a new connection parameter
"target_server_type".
Add "in_recovery" as a GUC_REPORT variable, to update clients when the server
is in recovery mode. This improves the speed of client connections to a standby
server, by avoiding the need to execute a command to determine if the server is
in recovery mode. Similarly, enhance "transaction_read_only" to be a GUC_REPORT
variable, for client connections to read-only/read-write servers.
Add new SIGUSR1 handling interrupt to support reporting of recovery mode exit
to all backends and their respective clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
contrib/postgres_fdw/expected/postgres_fdw.out | 2 +-
doc/src/sgml/high-availability.sgml | 5 +-
doc/src/sgml/libpq.sgml | 105 +++++-
doc/src/sgml/protocol.sgml | 9 +-
src/backend/access/transam/xlog.c | 3 +
src/backend/storage/ipc/procarray.c | 28 ++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 +
src/backend/tcop/postgres.c | 60 ++++
src/backend/utils/init/postinit.c | 9 +-
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 18 +-
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/include/tcop/tcopprot.h | 2 +
src/interfaces/libpq/fe-connect.c | 464 ++++++++++++++++++++++---
src/interfaces/libpq/fe-exec.c | 10 +-
src/interfaces/libpq/libpq-int.h | 55 ++-
src/test/recovery/t/001_stream_rep.pl | 202 ++++++++++-
20 files changed, 900 insertions(+), 90 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 90db550..ba0ba6e 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -8898,7 +8898,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, target_server_type, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index d6f79fc..afb80e6 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1885,8 +1885,9 @@ if (!triggered)
</para>
<para>
- During hot standby, the parameter <varname>transaction_read_only</varname> is always
- true and may not be changed. But as long as no attempt is made to modify
+ During hot standby, the parameters <varname>in_recovery</varname> and
+ <varname>transaction_read_only</varname> are always true and may not be
+ changed. But as long as no attempt is made to modify
the database, connections during hot standby will act much like any other
database connection. If failover or switchover occurs, the database will
switch to normal processing mode. Sessions will remain connected while the
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index f7b765f..de31b34 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1811,18 +1811,81 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- If this parameter is set to <literal>read-write</literal>, only a
- connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>read-write</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>standby</literal> and
+ <literal>prefer-standby</literal>.
+ <literal>standby</literal> may alternatively be specified as <literal>secondary</literal>.
+ <literal>prefer-standby</literal> may alternatively be specified as
+ <literal>prefer-secondary</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts are specified in the
+ connection string, each host is tried in the order given until a connection
+ is successful.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-write</literal>, only a connection in which
+ read-write transactions are accepted by default is considered acceptable. To determine
+ whether the server supports read-write transactions, then if the server is version 14
+ or greater, the support of read-write transactions is determined by the value of the
+ <varname>transaction_read_only</varname> configuration parameter that is reported by
+ the server upon successful connection. Otherwise if the server is prior to version 14,
+ the query <literal>SHOW transaction_read_only</literal> will be sent upon any successful
+ connection; if it returns <literal>on</literal>, it means the server doesn't support
+ read-write transactions.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection
+ in which read-only transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then if the server is version 14
+ or greater, only a connection in which the server is not in recovery mode is considered
+ acceptable. The recovery mode state is determined by the value of the
+ <varname>in_recovery</varname> configuration parameter that is reported by the server upon
+ successful connection. Otherwise, if the server is prior to version 14, only a connection in
+ which read-write transactions are accepted by default is considered acceptable. To determine
+ whether the server supports read-write transactions (or only read-only transactions), the
+ query <literal>SHOW transaction_read_only</literal> will be sent upon any successful
+ connection; if it returns <literal>on</literal>, it means the server doesn't support
+ read-write transactions.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, then if the server is version 14 or
+ greater, only a connection in which the server is in recovery mode is considered acceptable.
+ Otherwise, if the server is prior to version 14, only a connection for which the server only
+ supports read-only transactions is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, then if the server is version
+ 14 or greater, a connection in which the server is in recovery mode is preferred. Otherwise,
+ if the server is prior to version 14, a connection for which the server only supports
+ read-only transactions is preferred. If no such connections can be found, then a connection
+ in which the server is not in recovery mode (server is version 14 or greater) or a
+ connection for which the server supports read-write transactions (server is prior to version
+ 14) will be considered.
+ </para>
+
</listitem>
- </varlistentry>
+ </varlistentry>
+
+ <varlistentry id="libpq-connect-target-server-type" xreflabel="target_server_type">
+ <term><literal>target_server_type</literal></term>
+ <listitem>
+ <para>
+ This parameter mirrors the PGJDBC <literal>targetServerType</literal> connection parameter.
+ The supported options are a subset of those for <literal>target_session_attrs</literal>, namely
+ <literal>any</literal>, <literal>primary</literal>, <literal>secondary</literal> and
+ <literal>prefer-secondary</literal>. This parameter overrides any connection type specified by
+ <literal>target_session_attrs</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
</sect2>
@@ -2130,14 +2193,18 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname>, and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 14.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
@@ -7237,6 +7304,16 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
linkend="libpq-connect-target-session-attrs"/> connection parameter.
</para>
</listitem>
+
+ <listitem>
+ <para>
+ <indexterm>
+ <primary><envar>PGTARGETSERVERTYPE</envar></primary>
+ </indexterm>
+ <envar>PGTARGETSERVERTYPE</envar> behaves the same as the <xref
+ linkend="libpq-connect-target-server-type"/> connection parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para>
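The acceptance rules documented above can be summarised in a few lines of C. This is only a sketch of the documented behaviour, not code from the patch: the names WantedType, ServerState and first_pass_acceptable are hypothetical, and the fallback pass of prefer-standby is not modelled.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch; they mirror, but are not, the patch's enum. */
typedef enum
{
    WANT_ANY,
    WANT_READ_WRITE,
    WANT_READ_ONLY,
    WANT_PRIMARY,
    WANT_STANDBY,
    WANT_PREFER_STANDBY
} WantedType;

typedef struct
{
    int  version_num;   /* e.g. 140000 for version 14 */
    bool in_recovery;   /* only meaningful for version 14 or greater */
    bool read_only;     /* value of transaction_read_only */
} ServerState;

/* Returns true if the server satisfies the request on the first host scan. */
static bool
first_pass_acceptable(WantedType want, const ServerState *s)
{
    /* Before version 14 in_recovery is not reported; read-only stands in for it. */
    bool standby_like = (s->version_num >= 140000) ? s->in_recovery : s->read_only;

    switch (want)
    {
        case WANT_ANY:
            return true;
        case WANT_READ_WRITE:
            return !s->read_only;
        case WANT_READ_ONLY:
            return s->read_only;
        case WANT_PRIMARY:
            return !standby_like;
        case WANT_STANDBY:
        case WANT_PREFER_STANDBY:
            return standby_like;
    }
    return false;
}
```

With prefer-standby, a server rejected here is not discarded for good; it may still be used if no standby is found among the listed hosts.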
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 8b00235..643d171 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,17 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname>, and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname> were not reported by releases before 14.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 09c01ed..4129069 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7940,6 +7940,9 @@ StartupXLOG(void)
XLogCtl->SharedRecoveryState = RECOVERY_STATE_DONE;
SpinLockRelease(&XLogCtl->info_lck);
+ if (standbyState != STANDBY_DISABLED)
+ SendRecoveryExitSignal();
+
UpdateControlFile();
LWLockRelease(ControlFileLock);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index e687cde..0af2e30 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3682,6 +3682,34 @@ TerminateOtherDBBackends(Oid databaseId)
}
/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
* ProcArraySetReplicationSlotXmin
*
* Install limits to future computations of the xmin horizon to prevent vacuum
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 4fa385b..957df0d 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -585,6 +585,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
SetLatch(MyLatch);
latch_sigusr1_handler();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 52b2809..347b32d 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -140,6 +140,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index c9424f1..9fa16ec 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -163,6 +163,15 @@ static bool RecoveryConflictPending = false;
static bool RecoveryConflictRetryable = true;
static ProcSignalReason RecoveryConflictReason;
+/*
+ * Inbound recovery exit interrupts are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
+
/* reused buffer to pass to SendRowDescriptionMessage() */
static MemoryContext row_description_context = NULL;
static StringInfoData row_description_buf;
@@ -191,6 +200,7 @@ static void drop_unnamed_stmt(void);
static void log_disconnections(int code, Datum arg);
static void enable_statement_timeout(void);
static void disable_statement_timeout(void);
+static void ProcessRecoveryExitInterrupt(void);
/* ----------------------------------------------------------------
@@ -539,6 +549,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
@@ -3009,6 +3023,52 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
}
/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited the recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* flag that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+ */
+static void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
+
+
+/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
* If an interrupt condition is pending, and it's safe to service it,
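The split between HandleRecoveryExitInterrupt() and ProcessRecoveryExitInterrupt() follows the usual "set a flag in the handler, do the work later" pattern. A minimal standalone sketch of that pattern, assuming a POSIX system with SIGUSR1; all names below are hypothetical and nothing here uses PostgreSQL's latch or ProcSignal machinery:

```c
#include <signal.h>
#include <stdbool.h>

/* Set from the signal handler; only async-signal-safe work happens there. */
static volatile sig_atomic_t exit_pending = 0;

/* Stand-in for the reported in_recovery value. */
static bool in_recovery_reported = true;

/* Signal-handler half: just note that work needs to be done. */
static void
handle_exit_signal(int signo)
{
    (void) signo;
    exit_pending = 1;
}

/* Deferred half: runs in normal context, where real work is safe. */
static void
process_pending(void)
{
    if (exit_pending)
    {
        exit_pending = 0;
        in_recovery_reported = false;
    }
}

/* Installs the handler, delivers one signal, then processes it. */
static bool
demo(void)
{
    signal(SIGUSR1, handle_exit_signal);
    raise(SIGUSR1);         /* handler has run by the time raise() returns */
    process_pending();
    return in_recovery_reported;
}
```

In the patch the deferred half also calls pq_flush() so that the client sees the ParameterStatus message promptly; the sketch leaves that out.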
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index d4ab4c7..ce2d0b9 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -646,10 +646,13 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
/*
* The postmaster already started the XLOG machinery, but we need to
* call InitXLOGAccess(), if the system isn't in hot-standby mode.
- * This is handled by calling RecoveryInProgress and ignoring the
- * result.
+ * This is handled by calling RecoveryInProgress.
*/
- (void) RecoveryInProgress();
+ if (RecoveryInProgress())
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
}
else
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index 416a087..a4ebcef 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_num \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index de87ad6..f33cb68 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -615,6 +615,7 @@ static char *recovery_target_string;
static char *recovery_target_xid_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
@@ -1618,7 +1619,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
@@ -1845,6 +1846,21 @@ static struct config_bool ConfigureNamesBool[] =
},
{
+ /*
+ * Not for general use --- used to indicate whether the instance is in
+ * recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
+ {
{"allow_system_table_mods", PGC_SUSET, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
NULL,
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index ea8a876..d75db13 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -80,6 +80,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void TerminateOtherDBBackends(Oid databaseId);
extern void XidCacheRemoveRunningXids(TransactionId xid,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 5cb3969..6c243d4 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -43,6 +43,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index faaf1d3..d5c822d 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index bd30607..d29dd1f 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -68,6 +68,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
* handler */
+/* recovery exit interrupt handling function */
+extern void HandleRecoveryExitInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 7bee9dd..4326580 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -351,9 +351,14 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 17, /* sizeof("prefer-secondary") = 17 */
offsetof(struct pg_conn, target_session_attrs)},
+ {"target_server_type", "PGTARGETSERVERTYPE",
+ NULL, NULL,
+ "Target-Server-Type", "", 17, /* sizeof("prefer-secondary") = 17 */
+ offsetof(struct pg_conn, target_server_type)},
+
/* Terminating entry --- MUST BE LAST */
{NULL, NULL, NULL, NULL,
NULL, NULL, 0}
@@ -1001,6 +1006,58 @@ parse_comma_separated_list(char **startptr, bool *more)
}
/*
+ * validateAndGetTargetServerType
+ *
+ * Validate a given target_server_type option value and get the requested server type.
+ *
+ * Returns true if OK, false if the specified option value is invalid.
+ */
+static bool
+validateAndGetTargetServerType(const char *optionValue, TargetServerType *requestedServerType)
+{
+ if (strcmp(optionValue, "any") == 0)
+ *requestedServerType = SERVER_TYPE_ANY;
+ else if (strcmp(optionValue, "primary") == 0)
+ *requestedServerType = SERVER_TYPE_PRIMARY;
+ else if (strcmp(optionValue, "prefer-secondary") == 0)
+ *requestedServerType = SERVER_TYPE_PREFER_STANDBY;
+ else if (strcmp(optionValue, "secondary") == 0)
+ *requestedServerType = SERVER_TYPE_STANDBY;
+ else
+ return false;
+
+ return true;
+}
+
+/*
+ * validateAndGetTargetServerTypeFromSessionAttrs
+ *
+ * Validate a given target_session_attrs value and get the requested server type.
+ * All valid target_server_type option values are also allowed in target_session_attrs
+ * (as a single option value).
+ *
+ * Returns true if OK, false if the specified option value is invalid.
+ */
+static bool
+validateAndGetTargetServerTypeFromSessionAttrs(const char *optionValue, TargetServerType *requestedServerType)
+{
+ if (!validateAndGetTargetServerType(optionValue, requestedServerType))
+ {
+ if (strcmp(optionValue, "read-write") == 0)
+ *requestedServerType = SERVER_TYPE_READ_WRITE;
+ else if (strcmp(optionValue, "read-only") == 0)
+ *requestedServerType = SERVER_TYPE_READ_ONLY;
+ else if (strcmp(optionValue, "prefer-standby") == 0)
+ *requestedServerType = SERVER_TYPE_PREFER_STANDBY;
+ else if (strcmp(optionValue, "standby") == 0)
+ *requestedServerType = SERVER_TYPE_STANDBY;
+ else
+ return false;
+ }
+ return true;
+}
+
+/*
* connectOptions2
*
* Compute derived connection options after absorbing all user-supplied info.
@@ -1396,19 +1453,36 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (!validateAndGetTargetServerTypeFromSessionAttrs(conn->target_session_attrs, &conn->requested_server_type))
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid %s value: \"%s\"\n"),
- "target_settion_attrs",
+ "target_session_attrs",
conn->target_session_attrs);
return false;
}
}
/*
+ * Validate target_server_type option. If a target_server_type is
+ * specified, it overrides any target server type specified in
+ * target_session_attrs.
+ */
+ if (conn->target_server_type)
+ {
+ if (!validateAndGetTargetServerType(conn->target_server_type, &conn->requested_server_type))
+ {
+ conn->status = CONNECTION_BAD;
+ printfPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("invalid %s value: \"%s\"\n"),
+ "target_server_type",
+ conn->target_server_type);
+ return false;
+ }
+ }
+
+ /*
* Only if we get this far is it appropriate to try to connect. (We need a
* state flag, rather than just the boolean result of this function, in
* case someone tries to PQreset() the PGconn.)
@@ -2228,6 +2302,116 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (read-write/read-only).
+ * The connection state is set to try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the read-write host-index
+ * hasn't been set, then it is set to the index of this connection's host, so
+ * that a connection to this host can be made again in the event that no
+ * connection to a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedReadOrWriteConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_READ_WRITE ||
+ conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_primary_or_rw_host == -1)
+ {
+ /*
+ * This can only happen if server version < 14 (for which standby
+ * is regarded as read-only)
+ */
+ conn->which_primary_or_rw_host = conn->whichhost;
+ }
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (for recovery). The connection state
+ * is set to try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the primary host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedRecoveryConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_primary_or_rw_host == -1)
+ conn->which_primary_or_rw_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -2346,13 +2530,34 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->which_primary_or_rw_host >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to standby servers
+ * (or to read-only servers for server version < 14) and should
+ * now try to re-connect to a previously-connected-to primary
+ * server (or read-write server for server version < 14),
+ * whose host index is recorded in which_primary_or_rw_host.
+ */
+ conn->whichhost = conn->which_primary_or_rw_host;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->which_primary_or_rw_host = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3560,38 +3765,169 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
/*
- * If a read-write connection is required, see if we have one.
- *
* Servers before 7.4 lack the transaction_read_only GUC, but
* by the same token they don't have any read-only mode, so we
* may just skip the test in that case.
*/
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ if (conn->sversion >= 70400)
{
/*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * If a read-write or read-only connection is required,
+ * see if we have one.
*/
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
+ if (conn->requested_server_type == SERVER_TYPE_READ_WRITE ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY)
+ {
+ if (conn->sversion < 140000)
+ {
+ /*
+ * Save existing error messages across the
+ * PQsendQuery attempt. This is necessary because
+ * PQsendQuery is going to reset
+ * conn->errorMessage, so we would lose error
+ * messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->transaction_read_only &&
+ conn->requested_server_type == SERVER_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY))
+ {
+ /*
+ * Server is read-only but requested read-write,
+ * or server is read-write but requested
+ * read-only, reject and continue to process any
+ * further hosts ...
+ */
+
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+ else if (conn->requested_server_type == SERVER_TYPE_PRIMARY ||
+ conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
+ {
+ if (conn->sversion < 140000)
+ {
+ /*
+ * Save existing error messages across the
+ * PQsendQuery attempt. This is necessary because
+ * PQsendQuery is going to reset
+ * conn->errorMessage, so we would lose error
+ * messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->in_recovery &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * Server is in recovery but requested primary, or
+ * server is not in recovery but requested
+ * prefer-standby/standby.
+ */
+
+ if (conn->which_primary_or_rw_host == -2)
+ {
+ /*
+ * This can happen only for prefer-standby, on
+ * the second pass over the host list, after
+ * the first pass found no servers that are in
+ * recovery.
+ */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * Reject and continue to process any further
+ * hosts ...
+ */
+ rejectCheckedRecoveryConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+ }
+
+ /*
+ * For servers before 7.4 (which don't support read-only), if
+ * the requested type of connection is prefer-standby, then
+ * record this host index and try other specified hosts before
+ * considering it later. If the requested type of connection
+ * is read-only or standby, ignore this connection.
+ */
+
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
+ {
+ if (conn->which_primary_or_rw_host == -2)
+ {
+ /*
+ * This can happen only for prefer-standby, on the
+ * second pass over the host list, after the first
+ * pass found no servers that are read-only.
+ */
+ goto consume_checked_target_connection;
+ }
+
+ /* Close connection politely. */
conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ sendTerminateConn(conn);
+
+ /* Record host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY)
{
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ if (conn->which_primary_or_rw_host == -1)
+ conn->which_primary_or_rw_host = conn->whichhost;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3663,6 +3999,7 @@ keep_going: /* We will come back to here until there is
conn->status = CONNECTION_OK;
return PGRES_POLLING_OK;
}
+
case CONNECTION_CHECK_WRITABLE:
{
const char *displayed_host;
@@ -3690,42 +4027,51 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Reject this connection if the server is read-only but
+ * read-write was requested (or primary was requested and
+ * the server version is < 14), or if the server is
+ * read-write but read-only was requested (or standby was
+ * requested and the server version is < 14). If the
+ * server is read-write (version < 14) and prefer-standby
+ * was requested, the first such host is recorded so that
+ * it can be used in the next scan, should the first scan
+ * find no read-only server.
+ */
+ if ((readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE ||
+ conn->requested_server_type == SERVER_TYPE_PRIMARY)) ||
+ (!readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
{
- /* Not writable; fail this connection. */
+ if (conn->which_primary_or_rw_host == -2)
+ {
+ /*
+ * This can happen only for prefer-standby, on the
+ * second pass over the host list, after the first
+ * pass found no read-only servers. Free the result
+ * and the saved message before consuming.
+ */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
+ goto consume_checked_target_connection;
+ }
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ rejectCheckedReadOrWriteConnection(conn);
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3903,10 +4249,14 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ conn->which_primary_or_rw_host = -1;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
@@ -4075,6 +4425,8 @@ freePGconn(PGconn *conn)
free(conn->rowBuf);
if (conn->target_session_attrs)
free(conn->target_session_attrs);
+ if (conn->target_server_type)
+ free(conn->target_server_type);
termPQExpBuffer(&conn->errorMessage);
termPQExpBuffer(&conn->workBuffer);
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index eea0237..0605b23 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1058,7 +1058,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1112,6 +1112,14 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 1de91ae..13376ee 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -317,6 +317,17 @@ typedef struct pg_conn_host
* found in password file. */
} pg_conn_host;
+/* Target server type to connect to */
+typedef enum
+{
+ SERVER_TYPE_ANY = 0, /* Any server (default) */
+ SERVER_TYPE_READ_WRITE, /* Read-write server */
+ SERVER_TYPE_READ_ONLY, /* Read-only server */
+ SERVER_TYPE_PRIMARY, /* Primary server */
+ SERVER_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SERVER_TYPE_STANDBY /* Standby server */
+} TargetServerType;
+
/*
* PGconn stores all the state data associated with a single connection
* to a backend.
@@ -370,9 +381,30 @@ struct pg_conn
char *ssl_min_protocol_version; /* minimum TLS protocol version */
char *ssl_max_protocol_version; /* maximum TLS protocol version */
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: "any", "read-write",
+ * "read-only", "primary", "prefer-standby" (or "prefer-secondary"),
+ * "standby" (or "secondary").
+ */
char *target_session_attrs;
+ /*
+ * Type of server to connect to. Possible values: "any", "primary",
+ * "prefer-secondary", "secondary" This overrides any connection type
+ * specified by target_session_attrs. This option supports a subset of the
+ * target_session_attrs option values, and its purpose is to closely
+ * reflect the similar PGJDBC targetServerType option. Note also that this
+ * option only accepts single option values, whereas in future,
+ * target_session_attrs may accept multiple session attribute values.
+ */
+ char *target_server_type;
+
+ /*
+ * The requested server type, derived from target_session_attrs /
+ * target_server_type.
+ */
+ TargetServerType requested_server_type;
+
/* Optional file to write trace info to */
FILE *Pfdebug;
@@ -406,6 +438,24 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * Index of the first primary host (or read-write host if server version
+ * < 14) encountered (if any) in the connection string. This is used
+ * during processing of requested server connection type
+ * SERVER_TYPE_PREFER_STANDBY.
+ *
+ * The initial value is -1, indicating that no primary host has yet been
+ * found. It is then set to the index of the first primary host, if one is
+ * found in the connection string during processing. If a second
+ * connection attempt is later made to that primary host (because no
+ * connection to a standby server could be made), which_primary_or_rw_host
+ * is then set to -2 to avoid recursion during subsequent processing (and
+ * whichhost is set to the primary host index). Note that for server
+ * versions < 14, a requested type of "primary" is regarded as
+ * "read-write" and "standby" is regarded as "read-only".
+ */
+ int which_primary_or_rw_host;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
@@ -436,6 +486,8 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only */
+ bool in_recovery; /* in_recovery */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
@@ -540,7 +592,6 @@ struct pg_cancel
int be_key; /* key of backend --- needed for cancels */
};
-
/* String descriptions of the ExecStatusTypes.
* direct use of this array is deprecated; call PQresStatus() instead.
*/
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 9e31a53..a1e59c4 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 66;
# Initialize primary node
my $node_primary = get_new_node('primary');
@@ -85,7 +85,7 @@ sub test_target_session_attrs
my $node2_port = $node2->port;
my $node2_name = $node2->name;
- my $target_name = $target_node->name;
+ my $target_name = $target_node->name if (defined $target_node);
# Build connection string for connection attempt.
my $connstr = "host=$node1_host,$node2_host ";
@@ -97,10 +97,25 @@ sub test_target_session_attrs
my ($ret, $stdout, $stderr) =
$node1->psql('postgres', 'SHOW port;',
extra_params => [ '-d', $connstr ]);
- is( $status == $ret && $stdout eq $target_node->port,
- 1,
- "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
- );
+ if ($status == 0)
+ {
+ is( $status == $ret && $stdout eq $target_node->port,
+ 1,
+ "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
+ else
+ {
+ print "status = $status\n";
+ print "ret = $ret\n";
+ print "stdout = $stdout\n";
+ print "stderr = $stderr\n";
+
+ is( $status == $ret,
+ 1,
+ "fail to connect to any nodes if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
return;
}
@@ -121,6 +136,181 @@ test_target_session_attrs($node_primary, $node_standby_1, $node_primary, "any",
test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
"any", 0);
+# Connect to primary in "primary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "primary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_primary,
+ "primary", 0);
+
+# Connect to standby1 in "read-only" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "read-only", 0);
+
+# Connect to primary in "prefer-standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, $node_primary,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to primary in "prefer-secondary" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, $node_primary,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-ssecondary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "secondary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "secondary", 0);
+
+# Connect to standby1 in "secondary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "secondary", 0);
+
+# Fail to connect in "read-write" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "read-write", 2);
+
+# Fail to connect in "primary" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "primary", 2);
+
+# Fail to connect in "read-only" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "read-only", 2);
+
+# Fail to connect in "standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "standby", 2);
+
+# Fail to connect in "secondary" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "secondary", 2);
+
+# Tests for connection parameter target_server_type
+note "testing connection parameter \"target_server_type\"";
+
+# Routine designed to run tests on the connection parameter
+# target_server_type with multiple nodes.
+sub test_target_server_type
+{
+ my $node1 = shift;
+ my $node2 = shift;
+ my $target_node = shift;
+ my $mode = shift;
+ my $status = shift;
+
+ my $node1_host = $node1->host;
+ my $node1_port = $node1->port;
+ my $node1_name = $node1->name;
+ my $node2_host = $node2->host;
+ my $node2_port = $node2->port;
+ my $node2_name = $node2->name;
+
+ my $target_name = $target_node->name if (defined $target_node);
+
+ # Build connection string for connection attempt.
+ my $connstr = "host=$node1_host,$node2_host ";
+ $connstr .= "port=$node1_port,$node2_port ";
+ $connstr .= "target_server_type=$mode";
+
+ # The client used for the connection does not matter, only the backend
+ # point does.
+ my ($ret, $stdout, $stderr) =
+ $node1->psql('postgres', 'SHOW port;',
+ extra_params => [ '-d', $connstr ]);
+ if ($status == 0)
+ {
+ is( $status == $ret && $stdout eq $target_node->port,
+ 1,
+ "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
+ else
+ {
+ print "status = $status\n";
+ print "ret = $ret\n";
+ print "stdout = $stdout\n";
+ print "stderr = $stderr\n";
+
+ is( $status == $ret,
+ 1,
+ "fail to connect to any nodes if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
+
+ return;
+}
+
+# Connect to primary in "any" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_primary, "any",
+ 0);
+
+# Connect to standby1 in "any" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_standby_1,
+ "any", 0);
+
+# Connect to primary in "primary" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "primary" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "prefer-secondary" mode with primary,primary list.
+test_target_server_type($node_primary, $node_primary, $node_primary,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "secondary" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_standby_1,
+ "secondary", 0);
+
+# Connect to standby1 in "secondary" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_standby_1,
+ "secondary", 0);
+
+# Fail to connect in "primary" mode with standby1,standby2 list.
+test_target_server_type($node_standby_1, $node_standby_2, undef,
+ "primary", 2);
+
+# Fail to connect in "secondary" mode with primary,primary list.
+test_target_server_type($node_primary, $node_primary, undef,
+ "secondary", 2);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
1.8.3.1
On Thu, Aug 20, 2020 at 10:26 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
I have updated the patch (attached) based on your comments, with
adjustments made for additional changes based on feedback (which I
tend to agree with) from Robert Haas and Tsunakawa san, who suggested
read-write/read-only should be functionally different to
primary/standby, and not just have "read-write" a synonym for
"primary".
I also thought it appropriate to remove "read-write", "standby" and
"prefer-standby" from accepted values for "target_server_type"
(instead just support "secondary" and "prefer-secondary") to match the
similar targetServerType PGJDBC option.
So currently have as supported option values:
target_session_attrs:
any/read-write/read-only/primary/standby(/secondary)/prefer-standby(/prefer-secondary)
target_server_type: any/primary/secondary/prefer-secondary
+1 to your changes for the option values of these 2 variables.
Thanks for addressing my previous review comments in the v18 patch.
I have re-reviewed v18. Below are some additional (mostly minor)
things I noticed.
====
COMMENT (help text)
The help text is probably accurate but it does seem a bit confusing still.
Example1:
+ <para>
+ If this parameter is set to <literal>read-write</literal>, only a connection in which
+ read-write transactions are accepted by default is considered acceptable. To determine
+ whether the server supports read-write transactions, then if the server is version 14
+ or greater, the support of read-write transactions is determined by the value of the
+ <varname>transaction_read_only</varname> configuration parameter that is reported by
+ the server upon successful connection. Otherwise if the server is prior to version 14,
+ the query <literal>SHOW transaction_read_only</literal> will be sent upon any successful
+ connection; if it returns <literal>on</literal>, it means the server doesn't support
+ read-write transactions.
+ </para>
That fragment "To determine whether the server supports read-write
transactions, then" seems redundant.
Example2:
The Parameter Value descriptions seem inconsistently worded. e.g.
* "read-write" gives details about how "SHOW transaction_read_only"
can be called to decide r/w server.
* but then "read-only" doesn't mention it
* but then "primary" does
* but then "standby" doesn't
IMO if there were some up-front paragraphs to say how different
versions calculate the r/w support and recovery mode, then all the
different parameter values could be expressed in a much simpler way and
have less repetition (e.g. they can all look like the "read-only" one
does now).
e.g. I mean something similar to this (which is same wording as yours,
just rearranged a bit):
--
SERVER STATES
If the server is version 14 or greater, the support of read-write
transactions is determined by the value of the transaction_read_only
configuration parameter that is reported by the server upon successful
connection. Otherwise if the server is prior to version 14, the query
SHOW transaction_read_only will be sent upon any successful
connection; if it returns on, it means the server doesn't support
read-write transactions.
The recovery mode state is determined by the value of the in_recovery
configuration parameter that is reported by the server upon successful
connection.
PARAMETER VALUES
If this parameter is set to read-write, only a connection in which
read-write transactions are accepted by default is considered
acceptable.
If this parameter is set to read-only, only a connection in which
read-only transactions are accepted by default is considered
acceptable.
If this parameter is set to primary, then if the server is version 14
or greater, only a connection in which the server is not in recovery
mode is considered acceptable. Otherwise, if the server is prior to
version 14, only a connection in which read-write transactions are
accepted by default is considered acceptable.
If this parameter is set to standby, then if the server is version 14
or greater, only a connection in which the server is in recovery mode
is considered acceptable. Otherwise, if the server is prior to version
14, only a connection for which the server only supports read-only
transactions is considered acceptable.
If this parameter is set to prefer-standby, then if the server is
version 14 or greater, a connection in which the server is in recovery
mode is preferred. Otherwise, if the server is prior to version 14, a
connection for which the server only supports read-only transactions
is preferred. If no such connections can be found, then a connection
in which the server is not in recovery mode (server is version 14 or
greater) or a connection for which the server supports read-write
transactions (server is prior to version 14) will be considered
--
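The rearranged wording above amounts to a small decision table. A minimal sketch of it in C follows, assuming the version cut-over at 14 described in this thread. The enum and function names are hypothetical and are not the patch's identifiers; `sversion` is in libpq's numeric form (140000 for version 14).

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not the identifiers used in the patch. */
typedef enum
{
	TARGET_ANY,
	TARGET_READ_WRITE,
	TARGET_READ_ONLY,
	TARGET_PRIMARY,
	TARGET_STANDBY,
	TARGET_PREFER_STANDBY
} TargetKind;

/*
 * Returns true if a server in the given state is a first-choice match for
 * the requested target.  For TARGET_PREFER_STANDBY a false result means
 * "acceptable only as a fallback", not "rejected".
 */
bool
server_is_preferred(TargetKind target, int sversion,
					bool read_only, bool in_recovery)
{
	/* Per the thread, only version 14 or greater reports in_recovery. */
	bool		reports_recovery = (sversion >= 140000);

	switch (target)
	{
		case TARGET_ANY:
			return true;
		case TARGET_READ_WRITE:
			return !read_only;
		case TARGET_READ_ONLY:
			return read_only;
		case TARGET_PRIMARY:
			/* Older servers: fall back to the read-write test. */
			return reports_recovery ? !in_recovery : !read_only;
		case TARGET_STANDBY:
		case TARGET_PREFER_STANDBY:
			/* Older servers: fall back to the read-only test. */
			return reports_recovery ? in_recovery : read_only;
	}
	return false;
}
```

Written this way, the per-value documentation only needs to name the server state it requires, while the version-dependent detection is described once.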
====
COMMENT fe-connect.c (sizeof)
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
You said changing this 15 to 17 is debatable. So I will debate it.
IIUC the dispsize is defined as /* Field size in characters for dialog */
I imagine this could be used as potential max character length of a
text input field.
Therefore, so long as "prefer-secondary" remains a valid value for
target_session_attrs then I think dispsize ought to be 17 (not 15) to
accommodate it.
Otherwise setting to 15 may be preventing dialog entry of this
perfectly valid (albeit uncommon) value.
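The arithmetic behind 15 versus 17 can be checked mechanically. The sketch below is only an illustration: the value list is taken from this thread, and `needed_dispsize` is a hypothetical helper, not something in libpq.

```c
#include <stddef.h>
#include <string.h>

/* Values accepted for target_session_attrs, as listed in this thread. */
static const char *const target_session_attrs_values[] = {
	"any", "read-write", "read-only", "primary",
	"standby", "secondary", "prefer-standby", "prefer-secondary"
};

/*
 * Hypothetical helper: length of the longest accepted value plus one for
 * the terminating NUL, which is the same number sizeof() gives for the
 * corresponding string literal.
 */
size_t
needed_dispsize(void)
{
	size_t		max = 0;
	size_t		n = sizeof(target_session_attrs_values) /
		sizeof(target_session_attrs_values[0]);

	for (size_t i = 0; i < n; i++)
	{
		size_t		len = strlen(target_session_attrs_values[i]) + 1;

		if (len > max)
			max = len;
	}
	return max;
}
```

With "prefer-secondary" in the list the helper returns 17; sizeof("prefer-standby") alone is 15.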
====
COMMENT (typo)
+ /*
+ * Type of server to connect to. Possible values: "any", "primary",
+ * "prefer-secondary", "secondary" This overrides any connection type
+ * specified by target_session_attrs. This option supports a subset of the
Missing period before "This overrides"
====
COMMENT (comment)
+ /*
+ * Type of server to connect to. Possible values: "any", "primary",
+ * "prefer-secondary", "secondary" This overrides any connection type
+ * specified by target_session_attrs. This option supports a subset of the
+ * target_session_attrs option values, and its purpose is to closely
+ * reflect the similar PGJDBC targetServerType option. Note also that this
+ * option only accepts single option values, whereas in future,
+ * target_session_attrs may accept multiple session attribute values.
+ */
+ char *target_server_type;
Perhaps the part saying "... in future, target_session_attrs may
accept multiple session attribute values." more rightly belongs as a
comment for the *target_session_attrs field.
====
COMMENT (comments)
@@ -436,6 +486,8 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only */
+ bool in_recovery; /* in_recovery */
Just repeating the field name does not make for a very useful comment.
Can it be improved?
====
COMMENT (blank line removal?)
@@ -540,7 +592,6 @@ struct pg_cancel
int be_key; /* key of backend --- needed for cancels */
};
-
Removal of this blank line is cleanup in some place unrelated to this patch.
====
COMMENT (typo in test comment)
+# Connect to standby1 in "prefer-ssecondary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-secondary", 0);
+
Typo: "prefer-ssecondary"
====
COMMENT (fe-connect.c - suggest if/else instead of if/if)
+ /*
+ * For servers before 7.4 (which don't support read-only), if
+ * the requested type of connection is prefer-standby, then
+ * record this host index and try other specified hosts before
+ * considering it later. If the requested type of connection
+ * is read-only or standby, ignore this connection.
+ */
+
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
+ {
IIUC the only way to reach this code (because of all the previous
gotos) is when the server version is < 7.4.
So to make this more readable that "if" should ideally be "else if"
because the prior if block already says
+ if (conn->sversion >= 70400)
====
COMMENT (fe-connect - conn->sversion < 140000)
+ if (conn->sversion < 140000)
+ {
+ /*
+ * Save existing error messages across the
+ * PQsendQuery attempt. This is necessary because
+ * PQsendQuery is going to reset
+ * conn->errorMessage, so we would lose error
+ * messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
I am suspicious of the duplicate code blocks for (conn->sversion < 140000).
Both appear to be doing exactly the same thing for all request types
(excluding "any") so IMO these can be refactored into a single if
which is just beneath the check for (conn->sversion >= 70400). The
result can remove 25 lines and also be easier to read.
====
COMMENT (fe-connect.c - if comment)
+ else if ((conn->in_recovery &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * Server is in recovery but requested primary, or
+ * server is not in recovery but requested
+ * prefer-standby/standby.
+ */
This comment does not have much value because it reads almost exactly
the same as the code it is describing.
Maybe it can be reworded to be more useful, or if not, just remove it?
====
COMMENT (fe-connect.c - CHECK_WRITABLE wrong goto?)
+ if ((readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE ||
+ conn->requested_server_type == SERVER_TYPE_PRIMARY)) ||
+ (!readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
{
- /* Not writable; fail this connection. */
+ if (conn->which_primary_or_rw_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of the
+ * list of connections as it couldn't find any
+ * servers that are read-only.
+ */
+ goto consume_checked_target_connection;
+ }
Is this goto consume_checked_target_connection deliberate?
Previously (in the v17 patch) there was another label, and so this
same code did goto consume_checked_write_connection;
The v17 code seems more correct than the current v18 code, which is
now jumping to a label not even in the same case block!
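For context on the -2 check questioned above: the patch's comment on which_primary_or_rw_host describes a two-pass scan for prefer-standby. The following is a simplified model of that scan, not libpq code; the function name and the boolean host array are assumptions made for illustration.

```c
#include <stdbool.h>

/*
 * Simplified model of "prefer-standby" host selection; not libpq code.
 * is_standby[i] says whether host i would be seen as a standby (or as a
 * read-only server on older versions).  Returns the chosen host index,
 * or -1 if no host is available.
 */
int
choose_prefer_standby(const bool *is_standby, int nhosts)
{
	int			which_primary_or_rw_host = -1;	/* -1: none seen yet */

	/* First pass: take the first standby, remembering the first primary. */
	for (int i = 0; i < nhosts; i++)
	{
		if (is_standby[i])
			return i;
		if (which_primary_or_rw_host == -1)
			which_primary_or_rw_host = i;
	}

	/*
	 * Second pass: fall back to the remembered primary, if any.  In the
	 * patch the field is set to -2 at this point, so that the re-check of
	 * that host accepts it instead of rejecting it a second time.
	 */
	return which_primary_or_rw_host;
}
```

In this model the -2 state corresponds to the final return: the connection being checked is the fallback primary, so it should be accepted rather than rejected again.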
====
Kind Regards,
Peter Smith.
Fujitsu Australia
Hi Peter,
Thanks for the further review, an updated patch is attached. Please
see my responses to your comments below:
On Thu, Aug 20, 2020 at 11:36 AM Peter Smith <smithpb2250@gmail.com> wrote:
COMMENT (help text)
The help text is probably accurate but it does seem a bit confusing still.
...
IMO if there was some up-front paragraphs to say how different
versions calculate the r/w support and recovery mode, then all the
different parameter values can be expressed in a much simpler way and
have less repetition (e.g they can all look like the "read-only" one
does now).
GN RESPONSE:
I have updated the documentation, taking this view into account.
====
COMMENT fe-connect.c (sizeof)
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
You said changing this 15 to 17 is debatable. So I will debate it.
IIUC the dispsize is defined as /* Field size in characters for dialog */
I imagine this could be used as potential max character length of a
text input field.
Therefore, so long as "prefer-secondary" remains a valid value for
target_session_attrs then I think dispsize ought to be 17 (not 15) to
accommodate it.
Otherwise setting to 15 may be preventing dialog entry of this
perfectly valid (albeit uncommon) value.
GN RESPONSE:
My initial reasoning was that even though "prefer-secondary" is a
valid value, a GUI for target_session_attrs probably wouldn't present
that option, it would present "prefer-standby" instead (I was
imagining a drop-down menu, and it certainly wouldn't present both
"prefer-standby" and "prefer-secondary", as they are synonyms). If the
GUI did want to present the PGJDBC-compatible option values, it should
be looking at the dispsize for "Target-Server-Type" (which is 17, for
"prefer-secondary").
However, I guess there could be a number of ways to specify the option
value, even explicitly typing it into a textbox in the "database
connection dialog" that uses this information.
So in that case, I've updated the code, as you suggested, to use
dispsize=17 (for "prefer-secondary") in this case.
====
COMMENT (typo)
+ /*
+ * Type of server to connect to. Possible values: "any", "primary",
+ * "prefer-secondary", "secondary" This overrides any connection type
+ * specified by target_session_attrs. This option supports a subset of the
Missing period before "This overrides"
GN RESPONSE: Fixed.
====
COMMENT (comment)
+ /*
+ * Type of server to connect to. Possible values: "any", "primary",
+ * "prefer-secondary", "secondary" This overrides any connection type
+ * specified by target_session_attrs. This option supports a subset of the
+ * target_session_attrs option values, and its purpose is to closely
+ * reflect the similar PGJDBC targetServerType option. Note also that this
+ * option only accepts single option values, whereas in future,
+ * target_session_attrs may accept multiple session attribute values.
+ */
+ char *target_server_type;
Perhaps the part saying "... in future, target_session_attrs may
accept multiple session attribute values." more rightly belongs as a
comment for the *target_session_attrs field.
GN RESPONSE:
I can't really compare and contrast the two parameters without
mentioning "target_session_attrs" here.
"target_session_attrs" implies the possibility of multiple attributes.
If the difference between the attributes is provided in separate bits
of information for each attribute, the reader may not pick up on this
subtle difference between them.
====
COMMENT (comments)
@@ -436,6 +486,8 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only */
+ bool in_recovery; /* in_recovery */
Just repeating the field name does not make for a very useful comment.
Can it be improved?
GN RESPONSE: Yes, improved.
COMMENT (blank line removal?)
@@ -540,7 +592,6 @@ struct pg_cancel
int be_key; /* key of backend --- needed for cancels */
};
-
Removal of this blank line is cleanup in some place unrelated to this patch.
GN RESPONSE:
Blank line put back - but this appears to be because pgindent was NOT
previously run on this code prior to me running it.
COMMENT (typo in test comment)
+# Connect to standby1 in "prefer-ssecondary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-secondary", 0);
+
Typo: "prefer-ssecondary"
GN RESPONSE: Fixed.
COMMENT (fe-connect.c - suggest if/else instead of if/if)
+ /*
+ * For servers before 7.4 (which don't support read-only), if
+ * the requested type of connection is prefer-standby, then
+ * record this host index and try other specified hosts before
+ * considering it later. If the requested type of connection
+ * is read-only or standby, ignore this connection.
+ */
+
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
+ {
IIUC the only way to reach this code (because of all the previous
gotos) is when the server version is < 7.4.
So to make this more readable that "if" should ideally be "else if"
because the prior if block already says
+ if (conn->sversion >= 70400)
GN RESPONSE: Changed to "else if".
COMMENT (fe-connect - conn->sversion < 140000)
...
I am suspicious of the duplicate code blocks for (conn->sversion < 140000).
Both appear to be doing exactly the same thing for all requests types
(excluding "any") so IMO these can be refactored into a single if
which is just beneath the check for (conn->sversion >= 70400). The
result can remove 25 lines and also be easier to read.
GN RESPONSE:
I was able to refactor the code to make it a bit simpler and remove
the duplicate code block, after first adding a condition to exclude
"any".
COMMENT (fe-connect.c - if comment)
+ else if ((conn->in_recovery &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * Server is in recovery but requested primary, or
+ * server is not in recovery but requested
+ * prefer-standby/standby.
+ */
This comment does not have much value because it reads almost exactly
the same as the code it is describing.
Maybe it can be reworded to be more useful, or if not, just remove it?
GN RESPONSE: I've enhanced the comment.
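The acceptance rules discussed in the last two comments (the pre-14 handling and the in_recovery condition) can be summarised as one predicate. This is only a sketch of my reading of the rules as documented in the patch, not the code in fe-connect.c: the enum reuses the SERVER_TYPE_* names from the patch, but server_is_acceptable() and its parameters are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Mirrors the SERVER_TYPE_* names used in the patch. */
typedef enum
{
	SERVER_TYPE_ANY,
	SERVER_TYPE_READ_WRITE,
	SERVER_TYPE_READ_ONLY,
	SERVER_TYPE_PRIMARY,
	SERVER_TYPE_STANDBY,
	SERVER_TYPE_PREFER_STANDBY
} TargetServerType;

/*
 * Hypothetical helper: does a server satisfy the requested type on the
 * first pass over the host list?  For prefer-standby, "false" only means
 * "not preferred"; the caller may come back to this host later.
 */
bool
server_is_acceptable(int sversion, bool read_only, bool in_recovery,
					 TargetServerType requested)
{
	if (requested == SERVER_TYPE_ANY)
		return true;

	/* Servers before 7.4 have no read-only mode at all. */
	if (sversion < 70400)
		return requested == SERVER_TYPE_READ_WRITE ||
			requested == SERVER_TYPE_PRIMARY;

	switch (requested)
	{
		case SERVER_TYPE_READ_WRITE:
			return !read_only;
		case SERVER_TYPE_READ_ONLY:
			return read_only;
		case SERVER_TYPE_PRIMARY:
			/* in_recovery is only reported from version 14 onward */
			return (sversion >= 140000) ? !in_recovery : !read_only;
		case SERVER_TYPE_STANDBY:
		case SERVER_TYPE_PREFER_STANDBY:
			return (sversion >= 140000) ? in_recovery : read_only;
		default:
			return true;
	}
}
```

Written this way, the version check appears once per request type rather than in separate code blocks, which is the kind of simplification the review asked for.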
COMMENT (fe-connect.c - CHECK_WRITABLE wrong goto?)
+ if ((readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE ||
+ conn->requested_server_type == SERVER_TYPE_PRIMARY)) ||
+ (!readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
{
- /* Not writable; fail this connection. */
+ if (conn->which_primary_or_rw_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of the
+ * list of connections as it couldn't find any
+ * servers that are read-only.
+ */
+ goto consume_checked_target_connection;
+ }
Is this goto consume_checked_target_connection deliberate?
Previously (in the v17 patch) there was another label, and so this
same code did goto consume_checked_write_connection;
The v17 code seems more correct than the current v18 code, which is
now jumping to a label not even in the same case block!
GN RESPONSE:
Not deliberate, seems to have been messed up (possibly by copying
another block, to get a comment), but has now been corrected.
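For readers following the which_primary_or_rw_host logic, the two-pass behaviour of prefer-standby can be modelled roughly as below. This is a simplified sketch, not the patch's state machine: choose_host_prefer_standby() and the is_standby array are hypothetical, every host is assumed to be reachable, and only the -1 ("nothing recorded") and -2 ("fallback already used") sentinel values are taken from the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified model of the prefer-standby host scan.
 * is_standby[i] says whether host i turned out to be a standby.
 * Returns the index of the chosen host, or -1 if no host is available.
 */
int
choose_host_prefer_standby(const bool *is_standby, int nhosts)
{
	int			which_primary_or_rw_host = -1;	/* -1: no fallback recorded */
	int			whichhost;

	/* First pass: accept the first standby, remember the first primary. */
	for (whichhost = 0; whichhost < nhosts; whichhost++)
	{
		if (is_standby[whichhost])
			return whichhost;
		if (which_primary_or_rw_host == -1)
			which_primary_or_rw_host = whichhost;
	}

	/* Second pass: no standby was found, fall back to the remembered host. */
	if (which_primary_or_rw_host >= 0)
	{
		whichhost = which_primary_or_rw_host;
		/* As in the patch, mark the fallback as consumed so it is not retried. */
		which_primary_or_rw_host = -2;
		return whichhost;
	}

	return -1;
}
```

In the real code the second pass reconnects to the recorded host, and the -2 value is what lets the CONNECTION_CHECK_TARGET step accept a non-standby server on that retry.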
Regards,
Greg Nancarrow
Fujitsu Australia
Attachments:
v19-0001-Enhance-libpq-target_session_attrs-and-add-target_se.patch (application/octet-stream)
From 2d6a93124e7e6196c3792515ae0e1e80b6131a0c Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Fri, 21 Aug 2020 13:18:46 +1000
Subject: [PATCH v19] Enhance libpq target_session_attrs and add
target_server_type.
Enhance the connection parameter "target_session_attrs" to support new values
read-only/primary/standby/prefer-standby. To provide closer alignment with
similar functionality in the PGJDBC driver, add a new connection parameter
"target_server_type".
Add "in_recovery" as a GUC_REPORT variable, to update clients when the server
is in recovery mode. This improves the speed of client connections to a standby
server, by avoiding the need to execute a command to determine if the server is
in recovery mode. Similarly, enhance "transaction_read_only" to be a GUC_REPORT
variable, for client connections to read-only/read-write servers.
Add new SIGUSR1 handling interrupt to support reporting of recovery mode exit
to all backends and their respective clients.
Some parts of the code are taken from earlier development by
Elvis Pranskevichus and Tsunakawa Takayuki.
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
contrib/postgres_fdw/expected/postgres_fdw.out | 2 +-
doc/src/sgml/high-availability.sgml | 5 +-
doc/src/sgml/libpq.sgml | 108 +++++-
doc/src/sgml/protocol.sgml | 9 +-
src/backend/access/transam/xlog.c | 3 +
src/backend/storage/ipc/procarray.c | 28 ++
src/backend/storage/ipc/procsignal.c | 3 +
src/backend/storage/ipc/standby.c | 9 +
src/backend/tcop/postgres.c | 60 ++++
src/backend/utils/init/postinit.c | 9 +-
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 18 +-
src/include/storage/procarray.h | 1 +
src/include/storage/procsignal.h | 2 +
src/include/storage/standby.h | 1 +
src/include/tcop/tcopprot.h | 2 +
src/interfaces/libpq/fe-connect.c | 436 +++++++++++++++++++++----
src/interfaces/libpq/fe-exec.c | 10 +-
src/interfaces/libpq/libpq-int.h | 54 ++-
src/test/recovery/t/001_stream_rep.pl | 202 +++++++++++-
20 files changed, 869 insertions(+), 95 deletions(-)
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 90db550..ba0ba6e 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -8898,7 +8898,7 @@ DO $d$
END;
$d$;
ERROR: invalid option "password"
-HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
+HINT: Valid options in this context are: service, passfile, channel_binding, connect_timeout, dbname, host, hostaddr, port, options, application_name, keepalives, keepalives_idle, keepalives_interval, keepalives_count, tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, sslcrl, requirepeer, ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, target_server_type, use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, fetch_size
CONTEXT: SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 'dummypw')"
PL/pgSQL function inline_code_block line 3 at EXECUTE
-- If we add a password for our user mapping instead, we should get a different
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index d6f79fc..afb80e6 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1885,8 +1885,9 @@ if (!triggered)
</para>
<para>
- During hot standby, the parameter <varname>transaction_read_only</varname> is always
- true and may not be changed. But as long as no attempt is made to modify
+ During hot standby, the parameters <varname>in_recovery</varname> and
+ <varname>transaction_read_only</varname> are always true and may not be
+ changed. But as long as no attempt is made to modify
the database, connections during hot standby will act much like any other
database connection. If failover or switchover occurs, the database will
switch to normal processing mode. Sessions will remain connected while the
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index f7b765f..d844979 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1811,18 +1811,84 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- If this parameter is set to <literal>read-write</literal>, only a
- connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>read-write</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>standby</literal> and
+ <literal>prefer-standby</literal>.
+ <literal>standby</literal> may alternatively be specified as <literal>secondary</literal>.
+ <literal>prefer-standby</literal> may alternatively be specified as
+ <literal>prefer-secondary</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts are specified in the
+ connection string, each host is tried in the order given until a connection
+ is successful.
+ </para>
+
+ <para>
+ The parameter options are interpreted based on the server version, and whether the
+ server supports read-write transactions or is in recovery mode.
+ If the server is version 14 or greater, the support of read-write transactions is
+ determined by the value of the <varname>transaction_read_only</varname>
+ configuration parameter that is reported by the server upon successful connection.
+ Otherwise if the server is prior to version 14, the query
+ <literal>SHOW transaction_read_only</literal> will be sent upon any successful
+ connection; if it returns <literal>on</literal>, it means the server doesn't support
+ read-write transactions. If the server is version 14 or greater, the recovery mode
+ state is determined by the value of the <varname>in_recovery</varname>
+ configuration parameter that is reported by the server upon successful connection.
+ If the server is prior to version 14, the parameter options don't consider the
+ recovery mode state.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-write</literal>, only a connection in which
+ read-write transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection
+ in which read-only transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then if the server is version 14
+ or greater, only a connection in which the server is not in recovery mode is considered
+ acceptable. Otherwise, if the server is prior to version 14, only a connection in which
+ read-write transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, then if the server is version 14 or
+ greater, only a connection in which the server is in recovery mode is considered acceptable.
+ Otherwise, if the server is prior to version 14, only a connection for which the server only
+ supports read-only transactions is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, then if the server is version
+ 14 or greater, a connection in which the server is in recovery mode is preferred. Otherwise,
+ if the server is prior to version 14, a connection for which the server only supports
+ read-only transactions is preferred. If no such connections can be found, then a connection
+ in which the server is not in recovery mode (server is version 14 or greater) or a
+ connection for which the server supports read-write transactions (server is prior to version
+ 14) will be considered.
+ </para>
+
</listitem>
- </varlistentry>
+ </varlistentry>
+
+ <varlistentry id="libpq-connect-target-server-type" xreflabel="target_server_type">
+ <term><literal>target_server_type</literal></term>
+ <listitem>
+ <para>
+ The purpose of this parameter is to reflect the similar PGJDBC <literal>targetServerType</literal>.
+ The supported options are a subset of those for <literal>target_session_attrs</literal>, namely
+ <literal>any</literal>, <literal>primary</literal>, <literal>secondary</literal> and
+ <literal>prefer-secondary</literal>. This parameter overrides any connection type specified by
+ <literal>target_session_attrs</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
</variablelist>
</para>
</sect2>
@@ -2130,14 +2196,18 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname>, and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname>
+ were not reported by releases before 14.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
@@ -7237,6 +7307,16 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
linkend="libpq-connect-target-session-attrs"/> connection parameter.
</para>
</listitem>
+
+ <listitem>
+ <para>
+ <indexterm>
+ <primary><envar>PGTARGETSERVERTYPE</envar></primary>
+ </indexterm>
+ <envar>PGTARGETSERVERTYPE</envar> behaves the same as the <xref
+ linkend="libpq-connect-target-server-type"/> connection parameter.
+ </para>
+ </listitem>
</itemizedlist>
</para>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 8b00235..643d171 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,17 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname>, and
+ <varname>in_recovery</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> and <varname>in_recovery</varname> were not reported by releases before 14.0.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 09c01ed..4129069 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7940,6 +7940,9 @@ StartupXLOG(void)
XLogCtl->SharedRecoveryState = RECOVERY_STATE_DONE;
SpinLockRelease(&XLogCtl->info_lck);
+ if (standbyState != STANDBY_DISABLED)
+ SendRecoveryExitSignal();
+
UpdateControlFile();
LWLockRelease(ControlFileLock);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 51f8099..babaf13 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -3691,6 +3691,34 @@ TerminateOtherDBBackends(Oid databaseId)
}
/*
+ * SendSignalToAllBackends --- send a signal to all backends.
+ */
+void
+SendSignalToAllBackends(ProcSignalReason reason)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ pid_t pid = 0;
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ volatile PGPROC *proc = &allProcs[pgprocno];
+ VirtualTransactionId procvxid;
+
+ GET_VXID_FROM_PGPROC(procvxid, *proc);
+
+ pid = proc->pid;
+ if (pid != 0)
+ (void) SendProcSignal(pid, reason, procvxid.backendId);
+ }
+
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
* ProcArraySetReplicationSlotXmin
*
* Install limits to future computations of the xmin horizon to prevent vacuum
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 4fa385b..957df0d 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -585,6 +585,9 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN))
RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);
+ if (CheckProcSignal(PROCSIG_RECOVERY_EXIT))
+ HandleRecoveryExitInterrupt();
+
SetLatch(MyLatch);
latch_sigusr1_handler();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 52b2809..347b32d 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -140,6 +140,15 @@ ShutdownRecoveryTransactionEnvironment(void)
VirtualXactLockTableCleanup();
}
+/*
+ * SendRecoveryExitSignal
+ * Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+ SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}
/*
* -----------------------------------------------------
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index c9424f1..9fa16ec 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -163,6 +163,15 @@ static bool RecoveryConflictPending = false;
static bool RecoveryConflictRetryable = true;
static ProcSignalReason RecoveryConflictReason;
+/*
+ * Inbound recovery exit interrupts are initially processed by
+ * HandleRecoveryExitInterrupt(), called from inside a signal handler.
+ * That just sets the recoveryExitInterruptPending flag and sets the process
+ * latch. ProcessRecoveryExitInterrupt() will then be called whenever it's
+ * safe to actually deal with the interrupt.
+ */
+volatile sig_atomic_t recoveryExitInterruptPending = false;
+
/* reused buffer to pass to SendRowDescriptionMessage() */
static MemoryContext row_description_context = NULL;
static StringInfoData row_description_buf;
@@ -191,6 +200,7 @@ static void drop_unnamed_stmt(void);
static void log_disconnections(int code, Datum arg);
static void enable_statement_timeout(void);
static void disable_statement_timeout(void);
+static void ProcessRecoveryExitInterrupt(void);
/* ----------------------------------------------------------------
@@ -539,6 +549,10 @@ ProcessClientReadInterrupt(bool blocked)
/* Process notify interrupts, if any */
if (notifyInterruptPending)
ProcessNotifyInterrupt();
+
+ /* Process recovery exit interrupts that happened while reading */
+ if (recoveryExitInterruptPending)
+ ProcessRecoveryExitInterrupt();
}
else if (ProcDiePending)
{
@@ -3009,6 +3023,52 @@ RecoveryConflictInterrupt(ProcSignalReason reason)
}
/*
+ * HandleRecoveryExitInterrupt
+ *
+ * Signal handler portion of interrupt handling. Let the backend know
+ * that the server has exited the recovery mode.
+ */
+void
+HandleRecoveryExitInterrupt(void)
+{
+ /*
+ * Note: this is called by a SIGNAL HANDLER. You must be very wary what
+ * you do here.
+ */
+
+ /* flag that work needs to be done */
+ recoveryExitInterruptPending = true;
+
+ /* make sure the event is processed in due course */
+ SetLatch(MyLatch);
+}
+
+/*
+ * ProcessRecoveryExitInterrupt
+ *
+ * This is called just after waiting for a frontend command. If an
+ * interrupt arrives (via HandleRecoveryExitInterrupt()) while reading,
+ * the read will be interrupted via the process's latch, and this routine
+ * will get called.
+*/
+static void
+ProcessRecoveryExitInterrupt(void)
+{
+ recoveryExitInterruptPending = false;
+
+ SetConfigOption("in_recovery",
+ "off",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
+ /*
+ * Flush output buffer so that clients receive the ParameterStatus message
+ * as soon as possible.
+ */
+ pq_flush();
+}
+
+
+/*
* ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
*
* If an interrupt condition is pending, and it's safe to service it,
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index d4ab4c7..ce2d0b9 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -646,10 +646,13 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
/*
* The postmaster already started the XLOG machinery, but we need to
* call InitXLOGAccess(), if the system isn't in hot-standby mode.
- * This is handled by calling RecoveryInProgress and ignoring the
- * result.
+ * This is handled by calling RecoveryInProgress.
*/
- (void) RecoveryInProgress();
+ if (RecoveryInProgress())
+ SetConfigOption("in_recovery",
+ "on",
+ PGC_INTERNAL, PGC_S_OVERRIDE);
+
}
else
{
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index 416a087..a4ebcef 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_num \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_recovery"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index de87ad6..f33cb68 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -615,6 +615,7 @@ static char *recovery_target_string;
static char *recovery_target_xid_string;
static char *recovery_target_name_string;
static char *recovery_target_lsn_string;
+static bool in_recovery;
/* should be static, but commands/variable.c needs to get at this */
@@ -1618,7 +1619,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
@@ -1845,6 +1846,21 @@ static struct config_bool ConfigureNamesBool[] =
},
{
+ /*
+ * Not for general use --- used to indicate whether the instance is
+ * in recovery mode
+ */
+ {"in_recovery", PGC_INTERNAL, UNGROUPED,
+ gettext_noop("Shows whether the instance is in recovery mode."),
+ NULL,
+ GUC_REPORT | GUC_NO_SHOW_ALL | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_recovery,
+ false,
+ NULL, NULL, NULL
+ },
+
+ {
{"allow_system_table_mods", PGC_SUSET, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
NULL,
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index ea8a876..d75db13 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -80,6 +80,7 @@ extern void CancelDBBackends(Oid databaseid, ProcSignalReason sigmode, bool conf
extern int CountUserBackends(Oid roleid);
extern bool CountOtherDBBackends(Oid databaseId,
int *nbackends, int *nprepared);
+extern void SendSignalToAllBackends(ProcSignalReason reason);
extern void TerminateOtherDBBackends(Oid databaseId);
extern void XidCacheRemoveRunningXids(TransactionId xid,
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 5cb3969..6c243d4 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -43,6 +43,8 @@ typedef enum
PROCSIG_RECOVERY_CONFLICT_BUFFERPIN,
PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,
+ PROCSIG_RECOVERY_EXIT, /* recovery exit interrupt */
+
NUM_PROCSIGNALS /* Must be last! */
} ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index faaf1d3..d5c822d 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int max_standby_streaming_delay;
extern void InitRecoveryTransactionEnvironment(void);
extern void ShutdownRecoveryTransactionEnvironment(void);
+extern void SendRecoveryExitSignal(void);
extern void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid,
RelFileNode node);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index bd30607..d29dd1f 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -68,6 +68,8 @@ extern void StatementCancelHandler(SIGNAL_ARGS);
extern void FloatExceptionHandler(SIGNAL_ARGS) pg_attribute_noreturn();
extern void RecoveryConflictInterrupt(ProcSignalReason reason); /* called from SIGUSR1
* handler */
+/* recovery exit interrupt handling function */
+extern void HandleRecoveryExitInterrupt(void);
extern void ProcessClientReadInterrupt(bool blocked);
extern void ProcessClientWriteInterrupt(bool blocked);
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 7bee9dd..383d0d2 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -351,9 +351,14 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 17, /* sizeof("prefer-secondary") = 17 */
offsetof(struct pg_conn, target_session_attrs)},
+ {"target_server_type", "PGTARGETSERVERTYPE",
+ NULL, NULL,
+ "Target-Server-Type", "", 17, /* sizeof("prefer-secondary") = 17 */
+ offsetof(struct pg_conn, target_server_type)},
+
/* Terminating entry --- MUST BE LAST */
{NULL, NULL, NULL, NULL,
NULL, NULL, 0}
@@ -1001,6 +1006,58 @@ parse_comma_separated_list(char **startptr, bool *more)
}
/*
+ * validateAndGetTargetServerType
+ *
+ * Validate a given target_server_type option value and get the requested server type.
+ *
+ * Returns true if OK, false if the specified option value is invalid.
+ */
+static bool
+validateAndGetTargetServerType(const char *optionValue, TargetServerType *requestedServerType)
+{
+ if (strcmp(optionValue, "any") == 0)
+ *requestedServerType = SERVER_TYPE_ANY;
+ else if (strcmp(optionValue, "primary") == 0)
+ *requestedServerType = SERVER_TYPE_PRIMARY;
+ else if (strcmp(optionValue, "prefer-secondary") == 0)
+ *requestedServerType = SERVER_TYPE_PREFER_STANDBY;
+ else if (strcmp(optionValue, "secondary") == 0)
+ *requestedServerType = SERVER_TYPE_STANDBY;
+ else
+ return false;
+
+ return true;
+}
+
+/*
+ * validateAndGetTargetServerTypeFromSessionAttrs
+ *
+ * Validate a given target_session_attrs value and get the requested server type.
+ * All valid target_server_type option values are also allowed in target_session_attrs
+ * (as a single option value).
+ *
+ * Returns true if OK, false if the specified option value is invalid.
+ */
+static bool
+validateAndGetTargetServerTypeFromSessionAttrs(const char *optionValue, TargetServerType *requestedServerType)
+{
+ if (!validateAndGetTargetServerType(optionValue, requestedServerType))
+ {
+ if (strcmp(optionValue, "read-write") == 0)
+ *requestedServerType = SERVER_TYPE_READ_WRITE;
+ else if (strcmp(optionValue, "read-only") == 0)
+ *requestedServerType = SERVER_TYPE_READ_ONLY;
+ else if (strcmp(optionValue, "prefer-standby") == 0)
+ *requestedServerType = SERVER_TYPE_PREFER_STANDBY;
+ else if (strcmp(optionValue, "standby") == 0)
+ *requestedServerType = SERVER_TYPE_STANDBY;
+ else
+ return false;
+ }
+ return true;
+}
+
+/*
* connectOptions2
*
* Compute derived connection options after absorbing all user-supplied info.
@@ -1396,19 +1453,36 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (!validateAndGetTargetServerTypeFromSessionAttrs(conn->target_session_attrs, &conn->requested_server_type))
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid %s value: \"%s\"\n"),
- "target_settion_attrs",
+ "target_session_attrs",
conn->target_session_attrs);
return false;
}
}
/*
+ * Validate target_server_type option. If a target_server_type is
+ * specified, it overrides any target server type specified in
+ * target_session_attrs.
+ */
+ if (conn->target_server_type)
+ {
+ if (!validateAndGetTargetServerType(conn->target_server_type, &conn->requested_server_type))
+ {
+ conn->status = CONNECTION_BAD;
+ printfPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("invalid %s value: \"%s\"\n"),
+ "target_server_type",
+ conn->target_server_type);
+ return false;
+ }
+ }
+
+ /*
* Only if we get this far is it appropriate to try to connect. (We need a
* state flag, rather than just the boolean result of this function, in
* case someone tries to PQreset() the PGconn.)
@@ -2228,6 +2302,116 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (read-write/read-only).
+ * The connection state is set to try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the read-write host-index
+ * hasn't been set, then it is set to the index of this connection's host, so
+ * that a connection to this host can be made again in the event that no
+ * connection to a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedReadOrWriteConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record read-write host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_primary_or_rw_host == -1)
+ {
+ /*
+ * This can only happen if server version < 14 (for which standby
+ * is regarded as read-only)
+ */
+ conn->which_primary_or_rw_host = conn->whichhost;
+ }
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (for recovery). The connection state
+ * is set to try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the primary host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedRecoveryConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in recovery mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_primary_or_rw_host == -1)
+ conn->which_primary_or_rw_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -2346,13 +2530,34 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->which_primary_or_rw_host >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to standby servers
+ * (or to read-only servers for server version < 14) and should
+ * now try to re-connect to a previously-connected-to primary
+ * server (or read-write server for server version < 14),
+ * whose host index is recorded in which_primary_or_rw_host.
+ */
+ conn->whichhost = conn->which_primary_or_rw_host;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->which_primary_or_rw_host = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3559,39 +3764,129 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
- /*
- * If a read-write connection is required, see if we have one.
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but
- * by the same token they don't have any read-only mode, so we
- * may just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ if (conn->requested_server_type != SERVER_TYPE_ANY)
{
/*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * Servers before 7.4 lack the transaction_read_only GUC, but
+ * by the same token they don't have any read-only mode, so we
+ * may just skip the test in that case.
*/
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
+ if (conn->sversion >= 70400)
+ {
+ if (conn->sversion < 140000)
+ {
+ /*
+ * Save existing error messages across the
+ * PQsendQuery attempt. This is necessary because
+ * PQsendQuery is going to reset
+ * conn->errorMessage, so we would lose error
+ * messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->transaction_read_only &&
+ conn->requested_server_type == SERVER_TYPE_READ_WRITE) ||
+ (!conn->transaction_read_only &&
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY))
+ {
+ /*
+ * Server is read-only but requested read-write,
+ * or server is read-write but requested
+ * read-only, reject and continue to process any
+ * further hosts ...
+ */
+
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
+ else if ((conn->in_recovery &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!conn->in_recovery &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * Server is in recovery but requested primary, or
+ * server is not in recovery but requested
+ * prefer-standby/standby, reject and continue to
+ * process any further hosts ...
+ */
+
+ if (conn->which_primary_or_rw_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of
+ * the list of connections as it couldn't find
+ * any servers that are in recovery.
+ */
+ goto consume_checked_target_connection;
+ }
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ rejectCheckedRecoveryConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+ else if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
{
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ /*
+ * For servers before 7.4 (which don't support read-only), if
+ * the requested type of connection is prefer-standby, then
+ * record this host index and try other specified hosts before
+ * considering it later. If the requested type of connection
+ * is read-only or standby, ignore this connection.
+ */
+
+ if (conn->which_primary_or_rw_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of the list
+ * of connections as it couldn't find any servers that
+ * are read-only.
+ */
+ goto consume_checked_target_connection;
+ }
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY)
+ {
+ if (conn->which_primary_or_rw_host == -1)
+ conn->which_primary_or_rw_host = conn->whichhost;
+ }
+
+ /*
+ * Try next host if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3663,6 +3958,7 @@ keep_going: /* We will come back to here until there is
conn->status = CONNECTION_OK;
return PGRES_POLLING_OK;
}
+
case CONNECTION_CHECK_WRITABLE:
{
const char *displayed_host;
@@ -3690,42 +3986,52 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested server type is
+ * read-write (or requested server type is primary and
+ * server version < 14), ignore this connection. Server is
+ * read-write and requested type is read-only (or
+ * requested server type is standby and server version <
+ * 14), ignore this connection. Server is read-write
+ * (version < 14) and requested type is prefer-standby,
+ * record it for the first time and try to consume in the
+ * next scan (it means no read-only server is found in the
+ * first scan).
+ */
+ if ((readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE ||
+ conn->requested_server_type == SERVER_TYPE_PRIMARY)) ||
+ (!readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
{
- /* Not writable; fail this connection. */
+ if (conn->which_primary_or_rw_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of the
+ * list of connections as it couldn't find any
+ * servers that are read-only.
+ */
+ goto consume_checked_write_connection;
+ }
+
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
-
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
-
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
-
- /*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
- */
- conn->try_next_host = true;
+ rejectCheckedReadOrWriteConnection(conn);
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_write_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3903,10 +4209,14 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = false;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ conn->which_primary_or_rw_host = -1;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
@@ -4075,6 +4385,8 @@ freePGconn(PGconn *conn)
free(conn->rowBuf);
if (conn->target_session_attrs)
free(conn->target_session_attrs);
+ if (conn->target_server_type)
+ free(conn->target_server_type);
termPQExpBuffer(&conn->errorMessage);
termPQExpBuffer(&conn->workBuffer);
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index eea0237..0605b23 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1058,7 +1058,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
+ * Special hacks: remember client_encoding, transaction_read_only and
* standard_conforming_strings, and convert server version to a numeric
* form. We keep the first two of these in static variables as well, so
* that PQescapeString and PQescapeBytea can behave somewhat sanely (at
@@ -1112,6 +1112,14 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0);
+ }
+ else if (strcmp(name, "in_recovery") == 0)
+ {
+ conn->in_recovery = (strcmp(value, "on") == 0);
+ }
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 1de91ae..e3bc73a 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -317,6 +317,17 @@ typedef struct pg_conn_host
* found in password file. */
} pg_conn_host;
+/* Target server type to connect to */
+typedef enum
+{
+ SERVER_TYPE_ANY = 0, /* Any server (default) */
+ SERVER_TYPE_READ_WRITE, /* Read-write server */
+ SERVER_TYPE_READ_ONLY, /* Read-only server */
+ SERVER_TYPE_PRIMARY, /* Primary server */
+ SERVER_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SERVER_TYPE_STANDBY /* Standby server */
+} TargetServerType;
+
/*
* PGconn stores all the state data associated with a single connection
* to a backend.
@@ -370,9 +381,30 @@ struct pg_conn
char *ssl_min_protocol_version; /* minimum TLS protocol version */
char *ssl_max_protocol_version; /* maximum TLS protocol version */
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: "any", "read-write",
+ * "read-only", "primary", "prefer-standby" (or "prefer-secondary"),
+ * "standby" (or "secondary").
+ */
char *target_session_attrs;
+ /*
+ * Type of server to connect to. Possible values: "any", "primary",
+ * "prefer-secondary", "secondary". This overrides any connection type
+ * specified by target_session_attrs. This option supports a subset of the
+ * target_session_attrs option values, and its purpose is to closely
+ * reflect the similar PGJDBC targetServerType option. Note also that this
+ * option only accepts single option values, whereas in future,
+ * target_session_attrs may accept multiple session attribute values.
+ */
+ char *target_server_type;
+
+ /*
+ * The requested server type, derived from target_session_attrs /
+ * target_server_type.
+ */
+ TargetServerType requested_server_type;
+
/* Optional file to write trace info to */
FILE *Pfdebug;
@@ -406,6 +438,24 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * Index of the first primary host (or read-write host if server version
+ * < 14) encountered (if any) in the connection string. This is used
+ * during processing of requested server connection type
+ * SERVER_TYPE_PREFER_STANDBY.
+ *
+ * The initial value is -1, indicating that no primary host has yet been
+ * found. It is then set to the index of the first primary host, if one is
+ * found in the connection string during processing. If a second
+ * connection attempt is later made to that primary host (because no
+ * connection to a standby server could be made), which_primary_or_rw_host
+ * is then set to -2 to avoid recursion during subsequent processing (and
+ * whichhost is set to the primary host index). Note that for server
+ * versions < 14, a requested type of "primary" is regarded as
+ * "read-write" and "standby" is regarded as "read-only".
+ */
+ int which_primary_or_rw_host;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
@@ -436,6 +486,8 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ bool transaction_read_only; /* transaction_read_only GUC report variable state */
+ bool in_recovery; /* in_recovery GUC report variable state */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 9e31a53..fb05494 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 66;
# Initialize primary node
my $node_primary = get_new_node('primary');
@@ -85,7 +85,7 @@ sub test_target_session_attrs
my $node2_port = $node2->port;
my $node2_name = $node2->name;
- my $target_name = $target_node->name;
+ my $target_name = $target_node->name if (defined $target_node);
# Build connection string for connection attempt.
my $connstr = "host=$node1_host,$node2_host ";
@@ -97,10 +97,25 @@ sub test_target_session_attrs
my ($ret, $stdout, $stderr) =
$node1->psql('postgres', 'SHOW port;',
extra_params => [ '-d', $connstr ]);
- is( $status == $ret && $stdout eq $target_node->port,
- 1,
- "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
- );
+ if ($status == 0)
+ {
+ is( $status == $ret && $stdout eq $target_node->port,
+ 1,
+ "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
+ else
+ {
+ print "status = $status\n";
+ print "ret = $ret\n";
+ print "stdout = $stdout\n";
+ print "stderr = $stderr\n";
+
+ is( $status == $ret,
+ 1,
+ "fail to connect to any nodes if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
return;
}
@@ -121,6 +136,181 @@ test_target_session_attrs($node_primary, $node_standby_1, $node_primary, "any",
test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
"any", 0);
+# Connect to primary in "primary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "primary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_primary,
+ "primary", 0);
+
+# Connect to standby1 in "read-only" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "read-only", 0);
+
+# Connect to primary in "prefer-standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, $node_primary,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to primary in "prefer-secondary" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, $node_primary,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "secondary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "secondary", 0);
+
+# Connect to standby1 in "secondary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "secondary", 0);
+
+# Fail to connect in "read-write" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "read-write", 2);
+
+# Fail to connect in "primary" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "primary", 2);
+
+# Fail to connect in "read-only" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "read-only", 2);
+
+# Fail to connect in "standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "standby", 2);
+
+# Fail to connect in "secondary" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "secondary", 2);
+
+# Tests for connection parameter target_server_type
+note "testing connection parameter \"target_server_type\"";
+
+# Routine designed to run tests on the connection parameter
+# target_server_type with multiple nodes.
+sub test_target_server_type
+{
+ my $node1 = shift;
+ my $node2 = shift;
+ my $target_node = shift;
+ my $mode = shift;
+ my $status = shift;
+
+ my $node1_host = $node1->host;
+ my $node1_port = $node1->port;
+ my $node1_name = $node1->name;
+ my $node2_host = $node2->host;
+ my $node2_port = $node2->port;
+ my $node2_name = $node2->name;
+
+ my $target_name = $target_node->name if (defined $target_node);
+
+ # Build connection string for connection attempt.
+ my $connstr = "host=$node1_host,$node2_host ";
+ $connstr .= "port=$node1_port,$node2_port ";
+ $connstr .= "target_server_type=$mode";
+
+ # The client used for the connection does not matter, only the backend
+ # point does.
+ my ($ret, $stdout, $stderr) =
+ $node1->psql('postgres', 'SHOW port;',
+ extra_params => [ '-d', $connstr ]);
+ if ($status == 0)
+ {
+ is( $status == $ret && $stdout eq $target_node->port,
+ 1,
+ "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
+ else
+ {
+ print "status = $status\n";
+ print "ret = $ret\n";
+ print "stdout = $stdout\n";
+ print "stderr = $stderr\n";
+
+ is( $status == $ret,
+ 1,
+ "fail to connect to any nodes if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
+
+ return;
+}
+
+# Connect to primary in "any" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_primary, "any",
+ 0);
+
+# Connect to standby1 in "any" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_standby_1,
+ "any", 0);
+
+# Connect to primary in "primary" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "primary" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "prefer-secondary" mode with primary,primary list.
+test_target_server_type($node_primary, $node_primary, $node_primary,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "prefer-secondary" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-secondary", 0);
+
+# Connect to standby1 in "secondary" mode with primary,standby1 list.
+test_target_server_type($node_primary, $node_standby_1, $node_standby_1,
+ "secondary", 0);
+
+# Connect to standby1 in "secondary" mode with standby1,primary list.
+test_target_server_type($node_standby_1, $node_primary, $node_standby_1,
+ "secondary", 0);
+
+# Fail to connect in "primary" mode with standby1,standby2 list.
+test_target_server_type($node_standby_1, $node_standby_2, undef,
+ "primary", 2);
+
+# Fail to connect in "secondary" mode with primary,primary list.
+test_target_server_type($node_primary, $node_primary, undef,
+ "secondary", 2);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
1.8.3.1
Hi Greg,
Thanks for the further review, an updated patch is attached. Please
see my responses to your comments below:
Thanks for addressing all of my previous review comments in your new v19 patch.
Everything looks good to me now, so I am marking this as "ready for committer".
Kind Regards,
Peter Smith.
Fujitsu Australia
Greg Nancarrow <gregn4422@gmail.com> writes:
[ v19-0001-Enhance-libpq-target_session_attrs-and-add-target_se.patch ]
I started to look through this, and I find that I'm really pretty
disappointed in the direction the patch has gone of late. I think
there is no defensible reason for the choices that have been made
to have different behavior for v14-and-up servers than for older
servers. It's not necessary and it complicates life for users.
We can use pg_is_in_recovery() on every server version that has
hot standby at all, so there is no reason not to treat the
GUC_REPORT indicator as an optimization that lets us skip a
separate inquiry transaction, rather than something we have to
have to make the feature work correctly.
So I think what we ought to have is the existing read-write
vs read-only distinction, implemented as it is now by checking
"SHOW transaction_read_only" if the server fails to send that
as a GUC_REPORT value; and orthogonally to that, a primary/standby
distinction implemented by checking pg_is_in_recovery(), again
with a fast path if we got a ParameterStatus report.
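
A minimal sketch of that idea, assuming a three-valued state per property: a ParameterStatus report fills in the value, and a probe query is needed only while the value is still unknown. The names TriState, ServerState, note_parameter_status and needs_probe are hypothetical and are not taken from libpq or from the patch; "in_hot_standby" is the GUC name proposed later in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical tri-state; not a type from libpq or the patch. */
typedef enum
{
    TRISTATE_UNKNOWN = 0,       /* no ParameterStatus report seen yet */
    TRISTATE_NO,
    TRISTATE_YES
} TriState;

/* Hypothetical per-connection state for the two orthogonal properties. */
typedef struct
{
    TriState    read_only;      /* from transaction_read_only */
    TriState    hot_standby;    /* from in_hot_standby */
} ServerState;

/*
 * Record a ParameterStatus report.  Only the two tracked names change
 * the state; any other name is ignored.
 */
static void
note_parameter_status(ServerState *st, const char *name, const char *value)
{
    TriState    v;

    assert(st != NULL && name != NULL && value != NULL);
    v = (strcmp(value, "on") == 0) ? TRISTATE_YES : TRISTATE_NO;

    if (strcmp(name, "transaction_read_only") == 0)
        st->read_only = v;
    else if (strcmp(name, "in_hot_standby") == 0)
        st->hot_standby = v;
}

/* A probe query is needed only while the value is still unknown. */
static bool
needs_probe(TriState s)
{
    return s == TRISTATE_UNKNOWN;
}
```

With this shape the client needs no server-version test to decide whether a report is available: a server that sends the report simply skips the probe, whatever its version.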
I do not like the addition of target_server_type. It seems
unnecessary and confusing, particularly because you've had to
make a completely arbitrary decision about how it interacts with
target_session_attrs when both are specified. I think the
justification that "it's more like JDBC" is risible. Any user of
this will be writing C not Java.
A couple of other thoughts:
* Could we call the GUC "in_hot_standby" rather than "in_recovery"?
I think "recovery" is a poorly chosen legacy term that we ought to
avoid exposing to users more than we already have. We're stuck
with pg_is_in_recovery() I suppose, but let's not double down on
bad decisions.
* I don't think you really need a hard-wired test on v14-or-not
in the libpq code. The internal state about read-only and
hot-standby ought to be "yes", "no", or "unknown", starting in
the latter state. Receipt of ParameterStatus changes it from
"unknown" to one of the other two states. If we need to know
the value, and it's still "unknown", then we send a probe query.
We still need hard-coded version checks to know if the probe
query is safe, but those version breaks are far enough back to
be pretty well set in stone. (In the back of my mind here is
that people might well choose to back-port the GUC_REPORT marking
of transaction_read_only, and maybe even the other GUC if they
were feeling ambitious. So not having a hard-coded version
assumption where we don't particularly need it seems a good thing.)
* This can't be right can it? Too many commas.
@@ -1618,7 +1619,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT, GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
(The compiler will fail to bitch about that unfortunately, since
there are more struct fields that we leave uninitialized normally.)
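
To illustrate why the extra comma compiles silently, here is a simplified model, not the real guc.c structs: DemoRecord and the DEMO_* flag values are hypothetical stand-ins. The comma ends the initializer for the flags field, so the remaining bits land in whichever field comes next.

```c
#include <assert.h>

/* Hypothetical stand-ins for the flag bits; the values are illustrative. */
#define DEMO_REPORT         0x01
#define DEMO_NO_RESET_ALL   0x02
#define DEMO_NOT_IN_SAMPLE  0x04

/* Simplified record: a flags field followed by a field normally left zero. */
typedef struct
{
    const char *name;
    int         flags;
    int         status;         /* normally not initialized explicitly */
} DemoRecord;

/* Mistaken form: the comma ends the flags initializer early. */
static const DemoRecord wrong = {
    "transaction_read_only",
    DEMO_REPORT, DEMO_NO_RESET_ALL | DEMO_NOT_IN_SAMPLE
};

/* Intended form: all bits are OR'ed into the single flags field. */
static const DemoRecord right = {
    "transaction_read_only",
    DEMO_REPORT | DEMO_NO_RESET_ALL | DEMO_NOT_IN_SAMPLE
};
```

In this model, wrong.flags holds only DEMO_REPORT and the other two bits end up in wrong.status, while right.flags holds all three bits and right.status stays zero.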
BTW, I think it would be worth splitting this into separate server-side
and libpq patches. It looked to me like the server side is pretty
nearly committable, modulo bikeshedding about the new GUC name. We could
get that out of the way and then have a much smaller libpq patch to argue
about.
regards, tom lane
I wrote:
BTW, I think it would be worth splitting this into separate server-side
and libpq patches. It looked to me like the server side is pretty
nearly committable, modulo bikeshedding about the new GUC name.
Actually ... I looked that over again and got a bit more queasy about
all the new signaling logic it is adding. Signals are inherently
bug-prone stuff, plus it's not very clear what sort of guarantees
we'd have about either the reliability or the timeliness of client
notifications about exiting hot-standby mode.
I also wonder what consideration has been given to the performance
implications of marking transaction_read_only as GUC_REPORT, thus
causing client traffic to occur every time it's changed. Most of
the current GUC_REPORT variables don't change too often in typical
sessions, but I'm less convinced about that for transaction_read_only.
So I thought about alternative ways to implement this, and realized
that it would not be hard to make guc.c handle it all by itself, if
we use a custom show-hook for the in_hot_standby GUC that calls
RecoveryInProgress() instead of examining purely static state.
Now, by itself that idea only takes care of the session-start-time
report, because there'd never be any GUC action causing a new
report to occur. But we can improve the situation if we get rid
of the current design whereby ReportGUCOption() is called immediately
when any GUC value changes, and instead batch up the reports to
occur when we're about to go idle waiting for a new client query.
Not incidentally, this responds to a concern Robert mentioned awhile
back about the performance of GUC reporting [1]. You can already get
the server to spam the client excessively if any GUC_REPORT variables
are changed by, for example, functions' SET clauses, because that could
lead to the active value changing many times within a query. We've
gotten away with that so far, but it'd be a problem if any more-often-
changed variables get marked GUC_REPORT. (I actually have a vague
memory of other complaints about that, but I couldn't find any in a
desultory search of the archives.)
So I present 0001 attached which changes the GUC_REPORT code to work
that way, and then 0002 is a pretty small hack to add a reportable
in_hot_standby GUC by having the end-of-query function check (when
needed) to see if the active value changed.
As it stands, 0001 reduces the ParameterStatus message traffic to
at most one per GUC per query, but it doesn't attempt to eliminate
duplicate ParameterStatus messages altogether. We could do that
as a pretty simple adjustment if we're willing to expend the storage
to remember the last value sent to the client. It might be worth
doing, since for example the function-SET-clause case would typically
lead to no net change in the GUC's value by the end of the query.
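
A rough sketch of that adjustment, under the assumption that each variable keeps a copy of the last value sent. This is a standalone model, not code from 0001: DemoVar, demo_set, demo_report_if_changed and the messages_sent counter are hypothetical, and the counter stands in for actual ParameterStatus traffic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified variable record; not guc.c's config_generic. */
typedef struct
{
    const char *name;
    const char *value;          /* current value */
    char       *last_reported;  /* copy of the last value sent, or NULL */
    bool        needs_report;   /* set whenever the value is assigned */
} DemoVar;

static int  messages_sent = 0;  /* stands in for ParameterStatus traffic */

/* Assign a new value and mark the variable as possibly needing a report. */
static void
demo_set(DemoVar *var, const char *value)
{
    assert(var != NULL && value != NULL);
    var->value = value;
    var->needs_report = true;
}

/* Called once before going idle: send only if the value really differs. */
static void
demo_report_if_changed(DemoVar *var)
{
    if (!var->needs_report)
        return;
    var->needs_report = false;

    if (var->last_reported != NULL &&
        strcmp(var->last_reported, var->value) == 0)
        return;                 /* no net change since last report */

    /* Remember what we are about to send. */
    free(var->last_reported);
    var->last_reported = malloc(strlen(var->value) + 1);
    if (var->last_reported == NULL)
        abort();
    strcpy(var->last_reported, var->value);
    messages_sent++;
}
```

The cost is one extra string per reportable variable; the benefit is that a value changed and then restored within a query, as with a function SET clause, produces no message at all.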
An objection that could be raised to this approach for in_hot_standby
is that it will only report in_hot_standby becoming false at the end
of a query, whereas the v19 patch at least attempts to deliver an
async ParameterStatus message when the client is idle (and, I think,
indeed may fail to provide *any* message if the transition occurs
when it isn't idle). I don't find that too compelling though;
libpq-based clients, at least, don't have any very practical way to
deal with async ParameterStatus messages anyway.
(Note that I did not touch the docs here, so that while 0001 might
be committable as-is, 0002 is certainly just WIP.)
BTW, as far as the transaction_read_only side of things goes, IMO
it would make a lot more sense to mark default_transaction_read_only
as GUC_REPORT, since that changes a lot less frequently. We'd then
have to expend some work to report that value honestly, since right
now the hot-standby code cheats by ignoring the GUC's value during
hot standby. But I think a technique much like 0002's would work
for that.
Thoughts?
regards, tom lane
[1]: /messages/by-id/CA+TgmoaDoVtMnfKNFm-iyyCSp=FPiHkfU1AXuEHJqmcLTAX6kQ@mail.gmail.com
Attachments:
0001-report-guc-changes-at-query-end.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 411cfadbff..b67cc2f375 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4233,6 +4233,9 @@ PostgresMain(int argc, char *argv[],
pgstat_report_activity(STATE_IDLE, NULL);
}
+ /* Report any recently-changed GUC options */
+ ReportChangedGUCOptions();
+
ReadyForQuery(whereToSendOutput);
send_ready_for_query = false;
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 596bcb7b84..ddfc7ea05d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -4822,6 +4822,8 @@ static bool guc_dirty; /* true if need to do commit/abort work */
static bool reporting_enabled; /* true to enable GUC_REPORT */
+static bool report_needed; /* true if any GUC_REPORT reports are needed */
+
static int GUCNestLevel = 0; /* 1 when in main transaction */
@@ -5828,7 +5830,10 @@ ResetAllOptions(void)
gconf->scontext = gconf->reset_scontext;
if (gconf->flags & GUC_REPORT)
- ReportGUCOption(gconf);
+ {
+ gconf->status |= GUC_NEEDS_REPORT;
+ report_needed = true;
+ }
}
}
@@ -6215,7 +6220,10 @@ AtEOXact_GUC(bool isCommit, int nestLevel)
/* Report new value if we changed it */
if (changed && (gconf->flags & GUC_REPORT))
- ReportGUCOption(gconf);
+ {
+ gconf->status |= GUC_NEEDS_REPORT;
+ report_needed = true;
+ }
} /* end of stack-popping loop */
if (stack != NULL)
@@ -6257,26 +6265,64 @@ BeginReportingGUCOptions(void)
if (conf->flags & GUC_REPORT)
ReportGUCOption(conf);
}
+
+ report_needed = false;
}
/*
- * ReportGUCOption: if appropriate, transmit option value to frontend
+ * Report recently-changed GUC_REPORT variables.
+ * This is called just before we wait for a new client query.
+ *
+ * By handling things this way, we ensure that a ParameterStatus message
+ * is sent at most once per variable per query, even if the variable
+ * changed multiple times within the query. That's quite possible when
+ * using features such as function SET clauses. We do not, however, go to
+ * the length of trying to suppress sending anything when the variable was
+ * changed and then reverted to its original value.
+ */
+void
+ReportChangedGUCOptions(void)
+{
+ /* Quick exit if not (yet) enabled */
+ if (!reporting_enabled)
+ return;
+
+ /* Quick exit if no values have been changed */
+ if (!report_needed)
+ return;
+
+ /* Transmit new values of interesting variables */
+ for (int i = 0; i < num_guc_variables; i++)
+ {
+ struct config_generic *conf = guc_variables[i];
+
+ if ((conf->flags & GUC_REPORT) && (conf->status & GUC_NEEDS_REPORT))
+ ReportGUCOption(conf);
+ }
+
+ report_needed = false;
+}
+
+/*
+ * ReportGUCOption: transmit option value to frontend
+ *
+ * Caller is now fully responsible for deciding whether this should be done.
+ * However, we do clear the NEEDS_REPORT flag here.
*/
static void
ReportGUCOption(struct config_generic *record)
{
- if (reporting_enabled && (record->flags & GUC_REPORT))
- {
- char *val = _ShowOption(record, false);
- StringInfoData msgbuf;
+ char *val = _ShowOption(record, false);
+ StringInfoData msgbuf;
- pq_beginmessage(&msgbuf, 'S');
- pq_sendstring(&msgbuf, record->name);
- pq_sendstring(&msgbuf, val);
- pq_endmessage(&msgbuf);
+ pq_beginmessage(&msgbuf, 'S');
+ pq_sendstring(&msgbuf, record->name);
+ pq_sendstring(&msgbuf, val);
+ pq_endmessage(&msgbuf);
- pfree(val);
- }
+ pfree(val);
+
+ record->status &= ~GUC_NEEDS_REPORT;
}
/*
@@ -7667,7 +7713,10 @@ set_config_option(const char *name, const char *value,
}
if (changeVal && (record->flags & GUC_REPORT))
- ReportGUCOption(record);
+ {
+ record->status |= GUC_NEEDS_REPORT;
+ report_needed = true;
+ }
return changeVal ? 1 : -1;
}
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 2819282181..76236fb0c0 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -362,6 +362,7 @@ extern void AtStart_GUC(void);
extern int NewGUCNestLevel(void);
extern void AtEOXact_GUC(bool isCommit, int nestLevel);
extern void BeginReportingGUCOptions(void);
+extern void ReportChangedGUCOptions(void);
extern void ParseLongOption(const char *string, char **name, char **value);
extern bool parse_int(const char *value, int *result, int flags,
const char **hintmsg);
diff --git a/src/include/utils/guc_tables.h b/src/include/utils/guc_tables.h
index 04431d0eb2..a29c2b01b4 100644
--- a/src/include/utils/guc_tables.h
+++ b/src/include/utils/guc_tables.h
@@ -172,7 +172,8 @@ struct config_generic
* Caution: the GUC_IS_IN_FILE bit is transient state for ProcessConfigFile.
* Do not assume that its value represents useful information elsewhere.
*/
-#define GUC_PENDING_RESTART 0x0002
+#define GUC_PENDING_RESTART 0x0002 /* changed value cannot be applied yet */
+#define GUC_NEEDS_REPORT 0x0004 /* new value must be reported to client */
/* GUC records for specific variable types */
0002-implement-in-hot-standby-GUC.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ddfc7ea05d..207dc9bf4d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -209,6 +209,7 @@ static bool check_cluster_name(char **newval, void **extra, GucSource source);
static const char *show_unix_socket_permissions(void);
static const char *show_log_file_mode(void);
static const char *show_data_directory_mode(void);
+static const char *show_in_hot_standby(void);
static bool check_backtrace_functions(char **newval, void **extra, GucSource source);
static void assign_backtrace_functions(const char *newval, void *extra);
static bool check_recovery_target_timeline(char **newval, void **extra, GucSource source);
@@ -607,6 +608,8 @@ static int max_identifier_length;
static int block_size;
static int segment_size;
static int wal_block_size;
+static bool in_hot_standby;
+static bool last_reported_in_hot_standby;
static bool data_checksums;
static bool integer_datetimes;
static bool assert_enabled;
@@ -1844,6 +1847,17 @@ static struct config_bool ConfigureNamesBool[] =
NULL, NULL, NULL
},
+ {
+ {"in_hot_standby", PGC_INTERNAL, PRESET_OPTIONS,
+ gettext_noop("Shows whether hot standby is currently active."),
+ NULL,
+ GUC_REPORT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_hot_standby,
+ false,
+ NULL, NULL, show_in_hot_standby
+ },
+
{
{"allow_system_table_mods", PGC_SUSET, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
@@ -6266,6 +6280,9 @@ BeginReportingGUCOptions(void)
ReportGUCOption(conf);
}
+ /* Hack for in_hot_standby: remember the value we just sent */
+ last_reported_in_hot_standby = in_hot_standby;
+
report_needed = false;
}
@@ -6287,6 +6304,23 @@ ReportChangedGUCOptions(void)
if (!reporting_enabled)
return;
+ /*
+ * Since in_hot_standby isn't actually changed by normal GUC actions, we
+ * need a hack to check whether a new value needs to be reported to the
+ * client. For speed, we rely on the assumption that it can never
+ * transition from false to true.
+ */
+ if (last_reported_in_hot_standby && !RecoveryInProgress())
+ {
+ struct config_generic *record;
+
+ record = find_option("in_hot_standby", false, ERROR);
+ Assert(record != NULL);
+ record->status |= GUC_NEEDS_REPORT;
+ report_needed = true;
+ last_reported_in_hot_standby = false;
+ }
+
/* Quick exit if no values have been changed */
if (!report_needed)
return;
@@ -11738,6 +11772,18 @@ show_data_directory_mode(void)
return buf;
}
+static const char *
+show_in_hot_standby(void)
+{
+ /*
+ * Unlike most show hooks, this has a side-effect of updating the dummy
+ * GUC variable to contain the value last shown. See confederate code in
+ * BeginReportingGUCOptions and ReportChangedGUCOptions.
+ */
+ in_hot_standby = RecoveryInProgress();
+ return in_hot_standby ? "on" : "off";
+}
+
/*
* We split the input string, where commas separate function names
* and certain whitespace chars are ignored, into a \0-separated (and
On Sun, Sep 27, 2020 at 4:34 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thoughts?
Thanks for your thoughts, patches and all the pointers.
I'll be looking at all of them.
(And yes, the comma instead of bitwise OR is of course an error,
somehow made and gone unnoticed; the next field in the struct is an
enum, so accepts any int value).
Regards,
Greg Nancarrow
Fujitsu Australia
On 30.09.2020 10:57, Greg Nancarrow wrote:
Thanks for your thoughts, patches and all the pointers.
I'll be looking at all of them.
(And yes, the comma instead of bitwise OR is of course an error,
somehow made and gone unnoticed; the next field in the struct is an
enum, so accepts any int value).
Regards,
Greg Nancarrow
Fujitsu Australia
CFM reminder.
Hi, this entry is "Waiting on Author" and the thread was inactive for a
while. As far as I see, the patch needs some further work.
Are you going to continue working on it, or should I mark it as
"returned with feedback" until a better time?
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Anastasia Lubennikova <a.lubennikova@postgrespro.ru> writes:
Hi, this entry is "Waiting on Author" and the thread was inactive for a
while. As far as I see, the patch needs some further work.
Are you going to continue working on it, or should I mark it as
"returned with feedback" until a better time?
I'm inclined to go ahead and commit the 0001 patch I posted at [1] (/messages/by-id/5708.1601145259@sss.pgh.pa.us)
(ie, change the implementation of GUC_REPORT to avoid intra-query
reports), since that addresses a performance problem that's
independent of the goal here. The rest of this seems to still
be in Greg's court.
Has anyone got an opinion about the further improvement I suggested:
As it stands, 0001 reduces the ParameterStatus message traffic to
at most one per GUC per query, but it doesn't attempt to eliminate
duplicate ParameterStatus messages altogether. We could do that
as a pretty simple adjustment if we're willing to expend the storage
to remember the last value sent to the client. It might be worth
doing, since for example the function-SET-clause case would typically
lead to no net change in the GUC's value by the end of the query.
On reflection this seems worth doing, since excess client traffic
is far from free.
regards, tom lane
On 2020-Nov-24, Tom Lane wrote:
I'm inclined to go ahead and commit the 0001 patch I posted at [1]
(ie, change the implementation of GUC_REPORT to avoid intra-query
reports), since that addresses a performance problem that's
independent of the goal here. The rest of this seems to still
be in Greg's court.
Sounded like a good idea to me.
Has anyone got an opinion about the further improvement I suggested:
As it stands, 0001 reduces the ParameterStatus message traffic to
at most one per GUC per query, but it doesn't attempt to eliminate
duplicate ParameterStatus messages altogether. We could do that
as a pretty simple adjustment if we're willing to expend the storage
to remember the last value sent to the client. It might be worth
doing, since for example the function-SET-clause case would typically
lead to no net change in the GUC's value by the end of the query.
On reflection this seems worth doing, since excess client traffic
is far from free.
Agreed. If this is just a few hundred bytes of server-side local memory
per backend, it seems definitely worth it.
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
On 2020-Nov-24, Tom Lane wrote:
As it stands, 0001 reduces the ParameterStatus message traffic to
at most one per GUC per query, but it doesn't attempt to eliminate
duplicate ParameterStatus messages altogether. We could do that
as a pretty simple adjustment if we're willing to expend the storage
to remember the last value sent to the client. It might be worth
doing, since for example the function-SET-clause case would typically
lead to no net change in the GUC's value by the end of the query.
On reflection this seems worth doing, since excess client traffic
is far from free.
Agreed. If this is just a few hundred bytes of server-side local memory
per backend, it seems definitely worth it.
Yeah, given the current set of GUC_REPORT variables, it's hard to see
the storage for their last-reported values amounting to much. The need
for an extra pointer field in each GUC variable record might eat more
space than the actually-live values :-(
regards, tom lane
I wrote:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
Agreed. If this is just a few hundred bytes of server-side local memory
per backend, it seems definitely worth it.
Yeah, given the current set of GUC_REPORT variables, it's hard to see
the storage for their last-reported values amounting to much. The need
for an extra pointer field in each GUC variable record might eat more
space than the actually-live values :-(
Here's a v2 that does it like that.
regards, tom lane
Attachments:
0001-report-guc-changes-at-query-end-2.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 7c5f7c775b..34ed0e7558 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4229,6 +4229,9 @@ PostgresMain(int argc, char *argv[],
pgstat_report_activity(STATE_IDLE, NULL);
}
+ /* Report any recently-changed GUC options */
+ ReportChangedGUCOptions();
+
ReadyForQuery(whereToSendOutput);
send_ready_for_query = false;
}
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index bb34630e8e..245a3472bc 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -4822,6 +4822,8 @@ static bool guc_dirty; /* true if need to do commit/abort work */
static bool reporting_enabled; /* true to enable GUC_REPORT */
+static bool report_needed; /* true if any GUC_REPORT reports are needed */
+
static int GUCNestLevel = 0; /* 1 when in main transaction */
@@ -5452,6 +5454,7 @@ InitializeOneGUCOption(struct config_generic *gconf)
gconf->reset_scontext = PGC_INTERNAL;
gconf->stack = NULL;
gconf->extra = NULL;
+ gconf->last_reported = NULL;
gconf->sourcefile = NULL;
gconf->sourceline = 0;
@@ -5828,7 +5831,10 @@ ResetAllOptions(void)
gconf->scontext = gconf->reset_scontext;
if (gconf->flags & GUC_REPORT)
- ReportGUCOption(gconf);
+ {
+ gconf->status |= GUC_NEEDS_REPORT;
+ report_needed = true;
+ }
}
}
@@ -6215,7 +6221,10 @@ AtEOXact_GUC(bool isCommit, int nestLevel)
/* Report new value if we changed it */
if (changed && (gconf->flags & GUC_REPORT))
- ReportGUCOption(gconf);
+ {
+ gconf->status |= GUC_NEEDS_REPORT;
+ report_needed = true;
+ }
} /* end of stack-popping loop */
if (stack != NULL)
@@ -6257,17 +6266,60 @@ BeginReportingGUCOptions(void)
if (conf->flags & GUC_REPORT)
ReportGUCOption(conf);
}
+
+ report_needed = false;
+}
+
+/*
+ * ReportChangedGUCOptions: report recently-changed GUC_REPORT variables
+ *
+ * This is called just before we wait for a new client query.
+ *
+ * By handling things this way, we ensure that a ParameterStatus message
+ * is sent at most once per variable per query, even if the variable
+ * changed multiple times within the query. That's quite possible when
+ * using features such as function SET clauses. Function SET clauses
+ * also tend to cause values to change intraquery but eventually revert
+ * to their prevailing values; ReportGUCOption is responsible for avoiding
+ * redundant reports in such cases.
+ */
+void
+ReportChangedGUCOptions(void)
+{
+ /* Quick exit if not (yet) enabled */
+ if (!reporting_enabled)
+ return;
+
+ /* Quick exit if no values have been changed */
+ if (!report_needed)
+ return;
+
+ /* Transmit new values of interesting variables */
+ for (int i = 0; i < num_guc_variables; i++)
+ {
+ struct config_generic *conf = guc_variables[i];
+
+ if ((conf->flags & GUC_REPORT) && (conf->status & GUC_NEEDS_REPORT))
+ ReportGUCOption(conf);
+ }
+
+ report_needed = false;
}
/*
* ReportGUCOption: if appropriate, transmit option value to frontend
+ *
+ * We need not transmit the value if it's the same as what we last
+ * transmitted. However, clear the NEEDS_REPORT flag in any case.
*/
static void
ReportGUCOption(struct config_generic *record)
{
- if (reporting_enabled && (record->flags & GUC_REPORT))
+ char *val = _ShowOption(record, false);
+
+ if (record->last_reported == NULL ||
+ strcmp(val, record->last_reported) != 0)
{
- char *val = _ShowOption(record, false);
StringInfoData msgbuf;
pq_beginmessage(&msgbuf, 'S');
@@ -6275,8 +6327,19 @@ ReportGUCOption(struct config_generic *record)
pq_sendstring(&msgbuf, val);
pq_endmessage(&msgbuf);
- pfree(val);
+ /*
+ * We need a long-lifespan copy. If strdup() fails due to OOM, we'll
+ * set last_reported to NULL and thereby possibly make a duplicate
+ * report later.
+ */
+ if (record->last_reported)
+ free(record->last_reported);
+ record->last_reported = strdup(val);
}
+
+ pfree(val);
+
+ record->status &= ~GUC_NEEDS_REPORT;
}
/*
@@ -7695,7 +7758,10 @@ set_config_option(const char *name, const char *value,
}
if (changeVal && (record->flags & GUC_REPORT))
- ReportGUCOption(record);
+ {
+ record->status |= GUC_NEEDS_REPORT;
+ report_needed = true;
+ }
return changeVal ? 1 : -1;
}
diff --git a/src/include/utils/guc.h b/src/include/utils/guc.h
index 073c8f3e06..6a20a3bcec 100644
--- a/src/include/utils/guc.h
+++ b/src/include/utils/guc.h
@@ -363,6 +363,7 @@ extern void AtStart_GUC(void);
extern int NewGUCNestLevel(void);
extern void AtEOXact_GUC(bool isCommit, int nestLevel);
extern void BeginReportingGUCOptions(void);
+extern void ReportChangedGUCOptions(void);
extern void ParseLongOption(const char *string, char **name, char **value);
extern bool parse_int(const char *value, int *result, int flags,
const char **hintmsg);
diff --git a/src/include/utils/guc_tables.h b/src/include/utils/guc_tables.h
index 04431d0eb2..7f36e1146f 100644
--- a/src/include/utils/guc_tables.h
+++ b/src/include/utils/guc_tables.h
@@ -161,6 +161,8 @@ struct config_generic
GucContext reset_scontext; /* context that set the reset value */
GucStack *stack; /* stacked prior values */
void *extra; /* "extra" pointer for current actual value */
+ char *last_reported; /* if variable is GUC_REPORT, value last sent
+ * to client (NULL if not yet sent) */
char *sourcefile; /* file current setting is from (NULL if not
* set in config file) */
int sourceline; /* line in source file */
@@ -172,7 +174,8 @@ struct config_generic
* Caution: the GUC_IS_IN_FILE bit is transient state for ProcessConfigFile.
* Do not assume that its value represents useful information elsewhere.
*/
-#define GUC_PENDING_RESTART 0x0002
+#define GUC_PENDING_RESTART 0x0002 /* changed value cannot be applied yet */
+#define GUC_NEEDS_REPORT 0x0004 /* new value must be reported to client */
/* GUC records for specific variable types */
On Wed, Nov 25, 2020 at 12:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Here's a v2 that does it like that.
Looks OK to me.
Regards,
Greg Nancarrow
Fujitsu Australia
Greg Nancarrow <gregn4422@gmail.com> writes:
On Wed, Nov 25, 2020 at 12:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Here's a v2 that does it like that.
Looks OK to me.
Thanks for looking! Pushed.
At this point the cfbot is going to start complaining that the last-posted
patch in this thread no longer applies. Unless you have a new patch set
nearly ready to post, I think we should close the CF entry as RWF, and
then you can open a new one when you're ready.
regards, tom lane
On Thu, Nov 26, 2020 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thanks for looking! Pushed.
At this point the cfbot is going to start complaining that the last-posted
patch in this thread no longer applies. Unless you have a new patch set
nearly ready to post, I think we should close the CF entry as RWF, and
then you can open a new one when you're ready.
Actually, the cfbot shouldn't be complaining, as my last-posted patch
still seems to apply cleanly on the latest code (with your pushed
patch), and all tests pass.
Hopefully it's OK to let it roll over to the next CF with the WOA status.
I am actively working on processing the feedback and updating the
patch, and hope to post an update next week sometime.
Regards,
Greg Nancarrow
Fujitsu Australia
On Thu, Nov 26, 2020 at 11:07 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Thu, Nov 26, 2020 at 3:43 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Thanks for looking! Pushed.
At this point the cfbot is going to start complaining that the last-posted
patch in this thread no longer applies. Unless you have a new patch set
nearly ready to post, I think we should close the CF entry as RWF, and
then you can open a new one when you're ready.
Actually, the cfbot shouldn't be complaining, as my last-posted patch
still seems to apply cleanly on the latest code (with your pushed
patch), and all tests pass.
Hopefully it's OK to let it roll over to the next CF with the WOA status.
I am actively working on processing the feedback and updating the
patch, and hope to post an update next week sometime.
Posting an updated set of patches.
Regards,
Greg Nancarrow
Fujitsu Australia
Attachments:
v20-0001-in_hot_standby-and-transaction_read_only-reportable-GUCs.patch (application/octet-stream)
From cfc70e656d7226d7f6050a99b33eeb9d09841584 Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Tue, 1 Dec 2020 14:17:38 +1100
Subject: [PATCH v20 1/2] Add "in_hot_standby" reportable GUC and make
"transaction_read_only" GUC reportable.
Add "in_hot_standby" as a GUC_REPORT variable, to indicate to clients when
hot-standby is currently active, by having the end-of-query function check (when
needed) to see if the active hot-standby value changed. (Implementation: Tom Lane)
Also, enhance "transaction_read_only" to be a GUC_REPORT variable, for client
connections to read-only/read-write servers.
Making these GUC variables reportable avoids having to execute a query
post-connection in order to determine whether a host is in hot-standby mode or
is read-write (and reduces time to make the connection).
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
doc/src/sgml/high-availability.sgml | 11 +++++----
src/backend/utils/misc/check_guc | 2 +-
src/backend/utils/misc/guc.c | 48 ++++++++++++++++++++++++++++++++++++-
3 files changed, 54 insertions(+), 7 deletions(-)
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 19d7bd2..1f5aef9 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1848,8 +1848,9 @@ if (!triggered)
</para>
<para>
- During hot standby, the parameter <varname>transaction_read_only</varname> is always
- true and may not be changed. But as long as no attempt is made to modify
+ During hot standby, the parameters <varname>in_hot_standby</varname> and
+ <varname>transaction_read_only</varname> are always true and may not be
+ changed. But as long as no attempt is made to modify
the database, connections during hot standby will act much like any other
database connection. If failover or switchover occurs, the database will
switch to normal processing mode. Sessions will remain connected while the
@@ -1859,9 +1860,9 @@ if (!triggered)
</para>
<para>
- Users will be able to tell whether their session is read-only by
- issuing <command>SHOW transaction_read_only</command>. In addition, a set of
- functions (<xref linkend="functions-recovery-info-table"/>) allow users to
+ Users will be able to tell whether hot standby is currently active for their
+ session by issuing <command>SHOW in_hot_standby</command>. In addition, a set
+ of functions (<xref linkend="functions-recovery-info-table"/>) allow users to
access information about the standby server. These allow you to write
programs that are aware of the current state of the database. These
can be used to monitor the progress of recovery, or to allow you to
diff --git a/src/backend/utils/misc/check_guc b/src/backend/utils/misc/check_guc
index 416a087..5e7a54a 100755
--- a/src/backend/utils/misc/check_guc
+++ b/src/backend/utils/misc/check_guc
@@ -21,7 +21,7 @@ is_superuser lc_collate lc_ctype lc_messages lc_monetary lc_numeric lc_time \
pre_auth_delay role seed server_encoding server_version server_version_num \
session_authorization trace_lock_oidmin trace_lock_table trace_locks trace_lwlocks \
trace_notify trace_userlocks transaction_isolation transaction_read_only \
-zero_damaged_pages"
+zero_damaged_pages in_hot_standby"
### What options are listed in postgresql.conf.sample, but don't appear
### in guc.c?
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 245a347..311474e 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -209,6 +209,7 @@ static bool check_cluster_name(char **newval, void **extra, GucSource source);
static const char *show_unix_socket_permissions(void);
static const char *show_log_file_mode(void);
static const char *show_data_directory_mode(void);
+static const char *show_in_hot_standby(void);
static bool check_backtrace_functions(char **newval, void **extra, GucSource source);
static void assign_backtrace_functions(const char *newval, void *extra);
static bool check_recovery_target_timeline(char **newval, void **extra, GucSource source);
@@ -607,6 +608,8 @@ static int max_identifier_length;
static int block_size;
static int segment_size;
static int wal_block_size;
+static bool in_hot_standby;
+static bool last_reported_in_hot_standby;
static bool data_checksums;
static bool integer_datetimes;
static bool assert_enabled;
@@ -1618,7 +1621,7 @@ static struct config_bool ConfigureNamesBool[] =
{"transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the current transaction's read-only status."),
NULL,
- GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ GUC_REPORT | GUC_NO_RESET_ALL | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
},
&XactReadOnly,
false,
@@ -1845,6 +1848,17 @@ static struct config_bool ConfigureNamesBool[] =
},
{
+ {"in_hot_standby", PGC_INTERNAL, PRESET_OPTIONS,
+ gettext_noop("Shows whether hot standby is currently active."),
+ NULL,
+ GUC_REPORT | GUC_NOT_IN_SAMPLE | GUC_DISALLOW_IN_FILE
+ },
+ &in_hot_standby,
+ false,
+ NULL, NULL, show_in_hot_standby
+ },
+
+ {
{"allow_system_table_mods", PGC_SUSET, DEVELOPER_OPTIONS,
gettext_noop("Allows modifications of the structure of system tables."),
NULL,
@@ -6267,6 +6281,9 @@ BeginReportingGUCOptions(void)
ReportGUCOption(conf);
}
+ /* Hack for in_hot_standby: remember the value we just sent */
+ last_reported_in_hot_standby = in_hot_standby;
+
report_needed = false;
}
@@ -6290,6 +6307,23 @@ ReportChangedGUCOptions(void)
if (!reporting_enabled)
return;
+ /*
+ * Since in_hot_standby isn't actually changed by normal GUC actions, we
+ * need a hack to check whether a new value needs to be reported to the
+ * client. For speed, we rely on the assumption that it can never
+ * transition from false to true.
+ */
+ if (last_reported_in_hot_standby && !RecoveryInProgress())
+ {
+ struct config_generic *record;
+
+ record = find_option("in_hot_standby", false, ERROR);
+ Assert(record != NULL);
+ record->status |= GUC_NEEDS_REPORT;
+ report_needed = true;
+ last_reported_in_hot_standby = false;
+ }
+
/* Quick exit if no values have been changed */
if (!report_needed)
return;
@@ -11783,6 +11817,18 @@ show_data_directory_mode(void)
return buf;
}
+static const char *
+show_in_hot_standby(void)
+{
+ /*
+ * Unlike most show hooks, this has a side-effect of updating the dummy
+ * GUC variable to contain the value last shown. See confederate code in
+ * BeginReportingGUCOptions and ReportChangedGUCOptions.
+ */
+ in_hot_standby = RecoveryInProgress();
+ return in_hot_standby ? "on" : "off";
+}
+
/*
* We split the input string, where commas separate function names
* and certain whitespace chars are ignored, into a \0-separated (and
--
1.8.3.1
v20-0002-Enhance-libpq-target_session_attrs.patch (application/octet-stream)
From 1889e15035d1e3eadf5372b0255b24c6a22b3f5b Mon Sep 17 00:00:00 2001
From: Greg Nancarrow <gregn4422@gmail.com>
Date: Tue, 1 Dec 2020 15:27:28 +1100
Subject: [PATCH v20 2/2] Enhance the libpq "target_session_attrs" connection
parameter.
Enhance the connection parameter "target_session_attrs" to support new values:
read-only/primary/standby/prefer-standby.
Add a new "read-only" target_session_attrs option value, to support connecting
to a read-only server if available from the list of hosts (otherwise the
connection attempt fails).
Add a new "primary" target_session_attrs option value, to support connecting to
a server which is not in hot-standby mode, if available from the list of hosts
(otherwise the connection attempt fails).
Add a new "standby" target_session_attrs option value, to support connecting to
a server which is in hot-standby mode, if available from the list of hosts
(otherwise the connection attempt fails).
Add a new "prefer-standby" target_session_attrs option value, to support
connecting to a server which is in hot-standby mode, if available from the list
of hosts (otherwise connect to a server which is not in hot-standby mode).
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
doc/src/sgml/libpq.sgml | 76 ++++-
doc/src/sgml/protocol.sgml | 9 +-
src/interfaces/libpq/fe-connect.c | 505 ++++++++++++++++++++++++++++++----
src/interfaces/libpq/fe-exec.c | 18 +-
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 50 +++-
src/test/recovery/t/001_stream_rep.pl | 79 +++++-
7 files changed, 650 insertions(+), 90 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 1553f9c..b062086 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1817,18 +1817,61 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- If this parameter is set to <literal>read-write</literal>, only a
- connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>read-write</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>standby</literal> and
+ <literal>prefer-standby</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts are specified in the
+ connection string, each host is tried in the order given until a connection
+ is successful.
+ </para>
+
+ <para>
+ The support of read-write transactions is determined by the value of the
+ <varname>transaction_read_only</varname> configuration parameter, that is
+ either reported by the server (if supported) upon successful connection or
+ is otherwise explicitly queried by sending
+ <literal>SHOW transaction_read_only</literal> after successful connection; if
+ it returns <literal>on</literal>, it means the server doesn't support
+ read-write transactions. The standby mode state is determined by either the
+ value of the <varname>in_hot_standby</varname> configuration parameter, that is
+ reported by the server (if supported) upon successful connection, or is
+ otherwise explicitly queried by sending
+ <literal>SELECT pg_is_in_recovery()</literal> after successful connection; if
+ it returns <literal>t</literal>, it means the server is in hot standby mode.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-write</literal>, only a connection in
+ which read-write transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection in
+ which read-only transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then only a connection in
+ which the server is not in hot standby mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, then only a connection in
+ which the server is in hot standby mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, then a connection
+ in which the server is in hot standby mode is preferred. Otherwise, if no such
+ connections can be found, then a connection in which the server is not in hot
+ standby mode will be considered.
+ </para>
+
</listitem>
- </varlistentry>
+ </varlistentry>
+
</variablelist>
</para>
</sect2>
@@ -2136,14 +2179,18 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname>, and
+ <varname>in_hot_standby</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> and <varname>in_hot_standby</varname>
+ were not reported by releases before 14.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
@@ -7245,6 +7292,7 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
linkend="libpq-connect-target-session-attrs"/> connection parameter.
</para>
</listitem>
+
</itemizedlist>
</para>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index cee2888..e45dfc6 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1283,14 +1283,17 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>DateStyle</varname>,
<varname>IntervalStyle</varname>,
<varname>TimeZone</varname>,
- <varname>integer_datetimes</varname>, and
- <varname>standard_conforming_strings</varname>.
+ <varname>integer_datetimes</varname>,
+ <varname>standard_conforming_strings</varname>,
+ <varname>transaction_read_only</varname>, and
+ <varname>in_hot_standby</varname>.
(<varname>server_encoding</varname>, <varname>TimeZone</varname>, and
<varname>integer_datetimes</varname> were not reported by releases before 8.0;
<varname>standard_conforming_strings</varname> was not reported by releases
before 8.1;
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
- <varname>application_name</varname> was not reported by releases before 9.0.)
+ <varname>application_name</varname> was not reported by releases before 9.0;
+ <varname>transaction_read_only</varname> and <varname>in_hot_standby</varname> were not reported by releases before 14.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 7d04d36..eb7457a 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -352,7 +352,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1002,6 +1002,33 @@ parse_comma_separated_list(char **startptr, bool *more)
}
/*
+ * validateAndGetTargetServerType
+ *
+ * Validate a given target_session_attrs value and get the requested server type.
+ *
+ * Returns true if OK, false if the specified option value is invalid.
+ */
+static bool
+validateAndGetTargetServerType(const char *optionValue, TargetServerType *requestedServerType)
+{
+ if (strcmp(optionValue, "any") == 0)
+ *requestedServerType = SERVER_TYPE_ANY;
+ else if (strcmp(optionValue, "primary") == 0)
+ *requestedServerType = SERVER_TYPE_PRIMARY;
+ else if (strcmp(optionValue, "read-write") == 0)
+ *requestedServerType = SERVER_TYPE_READ_WRITE;
+ else if (strcmp(optionValue, "read-only") == 0)
+ *requestedServerType = SERVER_TYPE_READ_ONLY;
+ else if (strcmp(optionValue, "prefer-standby") == 0)
+ *requestedServerType = SERVER_TYPE_PREFER_STANDBY;
+ else if (strcmp(optionValue, "standby") == 0)
+ *requestedServerType = SERVER_TYPE_STANDBY;
+ else
+ return false;
+ return true;
+}
+
+/*
* connectOptions2
*
* Compute derived connection options after absorbing all user-supplied info.
@@ -1397,13 +1424,12 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (!validateAndGetTargetServerType(conn->target_session_attrs, &conn->requested_server_type))
{
conn->status = CONNECTION_BAD;
printfPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid %s value: \"%s\"\n"),
- "target_settion_attrs",
+ "target_session_attrs",
conn->target_session_attrs);
return false;
}
@@ -2229,6 +2255,102 @@ restoreErrorMessage(PGconn *conn, PQExpBuffer savedMessage)
termPQExpBuffer(savedMessage);
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (read-write/read-only).
+ * The connection state is set to try the next host (if any).
+ */
+static void
+rejectCheckedReadOrWriteConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (for standby). The connection state
+ * is set to try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the primary host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedStandbyConnection(PGconn *conn)
+{
+ /* Not a requested type; fail this connection. */
+ const char *displayed_host;
+ const char *displayed_port;
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in hot standby mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in hot standby mode "
+ "\"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_primary_host == -1)
+ conn->which_primary_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -2311,6 +2433,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_STANDBY:
break;
default:
@@ -2347,13 +2470,33 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->which_primary_host >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to standby servers
+ * and should now try to re-connect to a previously-connected-to
+ * primary server, whose host index is recorded in
+ * which_primary_host.
+ */
+ conn->whichhost = conn->which_primary_host;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->which_primary_host = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3560,39 +3703,183 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
- /*
- * If a read-write connection is required, see if we have one.
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but
- * by the same token they don't have any read-only mode, so we
- * may just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ if (conn->requested_server_type != SERVER_TYPE_ANY)
{
/*
- * Save existing error messages across the PQsendQuery
- * attempt. This is necessary because PQsendQuery is
- * going to reset conn->errorMessage, so we would lose
- * error messages related to previous hosts we have tried
- * and failed to connect to.
+ * If a read-write or read-only connection is required, see if
+ * we have one.
+ *
+ * Servers before 7.4 lack the transaction_read_only GUC, but
+ * by the same token they don't have any read-only mode, so we
+ * may just skip the test in that case.
*/
- if (!saveErrorMessage(conn, &savedMessage))
- goto error_return;
+ if (conn->sversion >= 70400 &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY))
+ {
+ /*
+ * For servers which don't have "transaction_read_only" as
+ * a GUC_REPORT variable, it is necessary to determine if
+ * they are read-only by sending the query
+ * "SHOW transaction_read_only".
+ */
+ if (conn->transaction_read_only == GUC_BOOL_UNKNOWN)
+ {
+ /*
+ * Save existing error messages across the
+ * PQsendQuery attempt. This is necessary because
+ * PQsendQuery is going to reset
+ * conn->errorMessage, so we would lose error
+ * messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SHOW transaction_read_only"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->transaction_read_only == GUC_BOOL_YES &&
+ conn->requested_server_type == SERVER_TYPE_READ_WRITE) ||
+ (conn->transaction_read_only == GUC_BOOL_NO &&
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY))
+ {
+ /*
+ * Server is read-only but requested read-write,
+ * or server is read-write but requested
+ * read-only, reject and continue to process any
+ * further hosts ...
+ */
+
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
- conn->status = CONNECTION_OK;
- if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * Servers before 9.0 don't support standby mode, skip the
+ * check when the requested type of connection is primary,
+ * prefer-standby or standby.
+ */
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_server_type == SERVER_TYPE_PRIMARY ||
+ conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
{
- restoreErrorMessage(conn, &savedMessage);
- goto error_return;
+ /*
+ * For servers which don't have the "in_hot_standby" GUC_REPORT
+ * variable, it is necessary to determine if they are in hot
+ * standby mode by sending the query "SELECT pg_is_in_recovery()".
+ */
+ if (conn->in_hot_standby == GUC_BOOL_UNKNOWN)
+ {
+ /*
+ * Save existing error messages across the PQsendQuery
+ * attempt. This is necessary because PQsendQuery is
+ * going to reset conn->errorMessage, so we would lose
+ * error messages related to previous hosts we have
+ * tried and failed to connect to.
+ */
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQsendQuery(conn, "SELECT pg_is_in_recovery()"))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ conn->status = CONNECTION_CHECK_STANDBY;
+
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->in_hot_standby == GUC_BOOL_YES &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (conn->in_hot_standby == GUC_BOOL_NO &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * Server is in standby but requested primary, or
+ * server is not in standby but requested
+ * prefer-standby/standby, reject and continue to
+ * process any further hosts ...
+ */
+
+ if (conn->which_primary_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of
+ * the list of connections, as it couldn't find
+ * any servers that are in standby mode.
+ */
+ goto consume_checked_target_connection;
+ }
+
+ rejectCheckedStandbyConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * If the requested type is prefer-standby, then record this host
+ * index and try any others before considering it later. If the
+ * requested type of connection is read-only or standby, ignore
+ * this connection, as servers of this version don't support this
+ * type of connection.
+ */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
+ {
+ if (conn->which_primary_host == -2)
+ {
+ /*
+ * This scenario is possible only for the prefer-standby
+ * type for the next pass of the list of connections, as
+ * it couldn't find any servers that are in standby mode.
+ */
+ goto consume_checked_target_connection;
+ }
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY)
+ {
+ if (conn->which_primary_host == -1)
+ conn->which_primary_host = conn->whichhost;
+ }
+
+ /*
+ * Try the next host, if any, but we don't want to consider
+ * additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
}
- conn->status = CONNECTION_CHECK_WRITABLE;
- restoreErrorMessage(conn, &savedMessage);
- return PGRES_POLLING_READING;
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3664,6 +3951,7 @@ keep_going: /* We will come back to here until there is
conn->status = CONNECTION_OK;
return PGRES_POLLING_OK;
}
+
case CONNECTION_CHECK_WRITABLE:
{
const char *displayed_host;
@@ -3691,42 +3979,135 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested server type is read-write,
+ * ignore this connection. Server is read-write and requested
+ * type is read-only, ignore this connection.
+ */
+ if ((readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE)) ||
+ (!readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_ONLY)))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
PQclear(res);
restoreErrorMessage(conn, &savedMessage);
- /* Append error report to conn->errorMessage. */
- if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
- displayed_host = conn->connhost[conn->whichhost].hostaddr;
- else
- displayed_host = conn->connhost[conn->whichhost].host;
- displayed_port = conn->connhost[conn->whichhost].port;
- if (displayed_port == NULL || displayed_port[0] == '\0')
- displayed_port = DEF_PGPORT_STR;
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
- appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("could not make a writable "
- "connection to server "
- "\"%s:%s\"\n"),
- displayed_host, displayed_port);
+ /* Session is requested type, so we're good. */
+ PQclear(res);
+ termPQExpBuffer(&savedMessage);
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
+
+ /*
+ * Something went wrong with "SHOW transaction_read_only". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ /* Append error report to conn->errorMessage. */
+ if (conn->connhost[conn->whichhost].type == CHT_HOST_ADDRESS)
+ displayed_host = conn->connhost[conn->whichhost].hostaddr;
+ else
+ displayed_host = conn->connhost[conn->whichhost].host;
+ displayed_port = conn->connhost[conn->whichhost].port;
+ if (displayed_port == NULL || displayed_port[0] == '\0')
+ displayed_port = DEF_PGPORT_STR;
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ "on server \"%s:%s\"\n"),
+ displayed_host, displayed_port);
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
+
+ case CONNECTION_CHECK_STANDBY:
+ {
+ const char *displayed_host;
+ const char *displayed_port;
+
+ if (!saveErrorMessage(conn, &savedMessage))
+ goto error_return;
+
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ restoreErrorMessage(conn, &savedMessage);
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_STANDBY;
+ restoreErrorMessage(conn, &savedMessage);
+ return PGRES_POLLING_READING;
+ }
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in standby mode and requested mode is
+ * primary, ignore it. Server is not in standby mode and
+ * requested mode is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server was found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
/*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections, as it couldn't find any servers that
+ * are in standby mode.
*/
- conn->try_next_host = true;
+ if (conn->which_primary_host == -2)
+ goto consume_checked_standby_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+ restoreErrorMessage(conn, &savedMessage);
+
+ rejectCheckedStandbyConnection(conn);
goto keep_going;
}
- /* Session is read-write, so we're good. */
+ consume_checked_standby_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
termPQExpBuffer(&savedMessage);
@@ -3739,7 +4120,7 @@ keep_going: /* We will come back to here until there is
}
/*
- * Something went wrong with "SHOW transaction_read_only". We
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
* should try next addresses.
*/
if (res)
@@ -3755,7 +4136,7 @@ keep_going: /* We will come back to here until there is
if (displayed_port == NULL || displayed_port[0] == '\0')
displayed_port = DEF_PGPORT_STR;
appendPQExpBuffer(&conn->errorMessage,
- libpq_gettext("test \"SHOW transaction_read_only\" failed "
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed "
"on server \"%s:%s\"\n"),
displayed_host, displayed_port);
@@ -3767,7 +4148,6 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
-
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3911,10 +4291,15 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->transaction_read_only = GUC_BOOL_UNKNOWN;
+ conn->in_hot_standby = GUC_BOOL_UNKNOWN;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ conn->which_primary_host = -1;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index eea0237..ff92ee2 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1058,11 +1058,11 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
- * standard_conforming_strings, and convert server version to a numeric
- * form. We keep the first two of these in static variables as well, so
- * that PQescapeString and PQescapeBytea can behave somewhat sanely (at
- * least in single-connection-using programs).
+ * Special hacks: remember client_encoding, standard_conforming_strings,
+ * transaction_read_only and in_hot_standby, and convert server version
+ * to a numeric form. We keep the first two of these in static
+ * variables as well, so that PQescapeString and PQescapeBytea can
+ * behave somewhat sanely (at least in single-connection-using programs).
*/
if (strcmp(name, "client_encoding") == 0)
{
@@ -1112,6 +1112,14 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "transaction_read_only") == 0)
+ {
+ conn->transaction_read_only = (strcmp(value, "on") == 0 ? GUC_BOOL_YES : GUC_BOOL_NO);
+ }
+ else if (strcmp(name, "in_hot_standby") == 0)
+ {
+ conn->in_hot_standby = (strcmp(value, "on") == 0 ? GUC_BOOL_YES : GUC_BOOL_NO);
+ }
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 3b6a9fb..832035c 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_STANDBY /* Check whether server is in standby mode */
} ConnStatusType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 1de91ae..a0a73c8 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -317,6 +317,29 @@ typedef struct pg_conn_host
* found in password file. */
} pg_conn_host;
+/* Target server type to connect to */
+typedef enum
+{
+ SERVER_TYPE_ANY = 0, /* Any server (default) */
+ SERVER_TYPE_READ_WRITE, /* Read-write server */
+ SERVER_TYPE_READ_ONLY, /* Read-only server */
+ SERVER_TYPE_PRIMARY, /* Primary server */
+ SERVER_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SERVER_TYPE_STANDBY /* Standby server */
+} TargetServerType;
+
+/*
+ * State of certain bool GUCs used by libpq, which are determined
+ * either by the GUC_REPORT mechanism (where supported by the server
+ * version) or by lazy evaluation (using a query sent to the server).
+ */
+typedef enum
+{
+ GUC_BOOL_UNKNOWN = 0, /* Currently unknown */
+ GUC_BOOL_YES, /* Yes (true) */
+ GUC_BOOL_NO /* No (false) */
+} GucBoolState;
+
/*
* PGconn stores all the state data associated with a single connection
* to a backend.
@@ -370,9 +393,17 @@ struct pg_conn
char *ssl_min_protocol_version; /* minimum TLS protocol version */
char *ssl_max_protocol_version; /* maximum TLS protocol version */
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: "any", "read-write",
+ * "read-only", "primary", "prefer-standby", "standby".
+ */
char *target_session_attrs;
+ /*
+ * The requested server type, derived from target_session_attrs.
+ */
+ TargetServerType requested_server_type;
+
/* Optional file to write trace info to */
FILE *Pfdebug;
@@ -406,6 +437,21 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * Index of the first primary host encountered (if any) in the connection
+ * string. This is used during processing of requested server connection type
+ * SERVER_TYPE_PREFER_STANDBY.
+ *
+ * The initial value is -1, indicating that no primary host has yet been
+ * found. It is then set to the index of the first primary host, if one is
+ * found in the connection string during processing. If a second
+ * connection attempt is later made to that primary host (because no
+ * connection to a standby server could be made), which_primary_host
+ * is then set to -2 to avoid recursion during subsequent processing (and
+ * whichhost is set to the primary host index).
+ */
+ int which_primary_host;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
@@ -436,6 +482,8 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ GucBoolState transaction_read_only; /* transaction_read_only GUC report variable state */
+ GucBoolState in_hot_standby; /* in_hot_standby GUC report variable state */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 9e31a53..15d0273 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 49;
# Initialize primary node
my $node_primary = get_new_node('primary');
@@ -85,7 +85,7 @@ sub test_target_session_attrs
my $node2_port = $node2->port;
my $node2_name = $node2->name;
- my $target_name = $target_node->name;
+ my $target_name = defined($target_node) ? $target_node->name : undef;
# Build connection string for connection attempt.
my $connstr = "host=$node1_host,$node2_host ";
@@ -97,10 +97,25 @@ sub test_target_session_attrs
my ($ret, $stdout, $stderr) =
$node1->psql('postgres', 'SHOW port;',
extra_params => [ '-d', $connstr ]);
- is( $status == $ret && $stdout eq $target_node->port,
- 1,
- "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
- );
+ if ($status == 0)
+ {
+ is( $status == $ret && $stdout eq $target_node->port,
+ 1,
+ "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
+ else
+ {
+ print "status = $status\n";
+ print "ret = $ret\n";
+ print "stdout = $stdout\n";
+ print "stderr = $stderr\n";
+
+ is( $status == $ret,
+ 1,
+ "fail to connect to any nodes if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
return;
}
@@ -121,6 +136,58 @@ test_target_session_attrs($node_primary, $node_standby_1, $node_primary, "any",
test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
"any", 0);
+# Connect to primary in "primary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "primary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_primary,
+ "primary", 0);
+
+# Connect to standby1 in "read-only" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "read-only", 0);
+
+# Connect to primary in "prefer-standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, $node_primary,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "standby", 0);
+
+# Fail to connect in "read-write" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "read-write", 2);
+
+# Fail to connect in "primary" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "primary", 2);
+
+# Fail to connect in "read-only" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "read-only", 2);
+
+# Fail to connect in "standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "standby", 2);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
--
1.8.3.1
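The two-pass host scan that the patch uses for prefer-standby can be hard to follow from the diff alone, so here is a minimal stand-alone sketch of it. It is not code from the patch: the function name choose_prefer_standby and the is_standby array are hypothetical, and every listed host is assumed to be reachable. Only the which_primary_host sentinel values (-1 for "no primary seen yet", -2 for "second attempt in progress") are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical simulation of the prefer-standby scan (not libpq code).
 * is_standby[i] tells whether host i is in hot standby mode; all hosts
 * are assumed reachable.  Returns the index of the host that would be
 * used, or -1 if the host list is empty.
 */
static int
choose_prefer_standby(const bool *is_standby, int nhosts)
{
    int which_primary_host = -1;    /* -1: no primary seen, -2: retrying it */
    int whichhost = -1;

    assert(nhosts >= 0);

    for (;;)
    {
        /* Advance to the next host, as the try_next_host logic does. */
        if (whichhost + 1 >= nhosts)
        {
            if (which_primary_host < 0)
                return -1;          /* no more hosts, nothing remembered */

            /* First pass found no standby: go back to the first primary. */
            whichhost = which_primary_host;
            which_primary_host = -2;
        }
        else
            whichhost++;

        if (is_standby[whichhost])
            return whichhost;       /* a standby is always acceptable */

        if (which_primary_host == -2)
            return whichhost;       /* second pass: accept the primary */

        /* First pass: remember the first primary and keep scanning. */
        if (which_primary_host == -1)
            which_primary_host = whichhost;
    }
}
```

With the host list primary,standby the sketch picks index 1; with primary,primary it scans both hosts, then returns to index 0, which matches the expectations in the TAP tests above.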
Greg Nancarrow <gregn4422@gmail.com> writes:
Posting an updated set of patches.
I've reviewed and pushed most of v20-0001, with the following changes:
* I realized that we had more moving parts than necessary for
in_hot_standby. We don't really need two static variables, one is
sufficient --- and we shouldn't make the SHOW hook have side-effects,
that's just dangerous.
* The documentation patches were missing an addition to config.sgml,
as well as failing to list the new GUC in the two places where we
document all GUC_REPORT variables.
What I did *not* push was the change to mark transaction_read_only
as GUC_REPORT. I find that idea highly dubious, for a couple of
reasons:
* It'll create useless ParameterStatus traffic during normal operations
of an application using "START TRANSACTION READ ONLY" or the like.
* I do not think it will actually work for the desired purpose of
finding out the read-only state during session connection. AFAICS,
we don't set XactReadOnly to a meaningful value except when starting
a transaction. Yeah, we'll run a transaction during login because
we have to examine the system catalogs ... but do we start a new
one after absorbing the effects of, say, ALTER USER SET
default_transaction_read_only? I doubt it, and even if it works
today it'd be fragile, because someday somebody will try to collapse
any multiple transactions during startup into one transaction.
I think what we want to do is mark default_transaction_read_only as
GUC_REPORT, instead. That will give a reliable report of what the
state of its GUC is, and you can combine it with in_hot_standby
to decide whether the session should be considered read-only.
If you don't get those two GUC values during connection, then you
can fall back on "SHOW transaction_read_only".
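
As an illustration only, the decision rule described above could be sketched as follows. This is a standalone model, not code from the patch: the names SessionKind, classify_session and session_kind_name are hypothetical, and the two arguments stand for the reported values of default_transaction_read_only and in_hot_standby (NULL meaning the server did not report the parameter).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; not part of libpq. */
typedef enum
{
    SESSION_READ_WRITE,
    SESSION_READ_ONLY,
    SESSION_NEEDS_QUERY     /* caller falls back to SHOW transaction_read_only */
} SessionKind;

/*
 * Each argument is the reported value of the GUC ("on" or "off"),
 * or NULL if the server did not report it during connection.
 */
SessionKind
classify_session(const char *default_transaction_read_only,
                 const char *in_hot_standby)
{
    /* Without both reports we cannot decide locally. */
    if (default_transaction_read_only == NULL || in_hot_standby == NULL)
        return SESSION_NEEDS_QUERY;

    /* Either condition alone makes the session read-only by default. */
    if (strcmp(default_transaction_read_only, "on") == 0 ||
        strcmp(in_hot_standby, "on") == 0)
        return SESSION_READ_ONLY;

    return SESSION_READ_WRITE;
}

/* Human-readable label for a SessionKind (illustrative helper). */
const char *
session_kind_name(SessionKind kind)
{
    switch (kind)
    {
        case SESSION_READ_WRITE:
            return "read-write";
        case SESSION_READ_ONLY:
            return "read-only";
        case SESSION_NEEDS_QUERY:
            return "unknown, query needed";
    }
    assert(0 && "unhandled SessionKind");
    return "";
}
```

In a real client the two strings would presumably come from PQparameterStatus(), which returns NULL for a parameter the server has not reported, so older servers would naturally land in the SESSION_NEEDS_QUERY branch.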
Setting this back to waiting on author.
regards, tom lane
On Wed, Jan 6, 2021 at 3:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Greg Nancarrow <gregn4422@gmail.com> writes:
Posting an updated set of patches.
I've reviewed and pushed most of v20-0001, with the following changes:
* I realized that we had more moving parts than necessary for
in_hot_standby. We don't really need two static variables, one is
sufficient --- and we shouldn't make the SHOW hook have side-effects,
that's just dangerous.
* The documentation patches were missing an addition to config.sgml,
as well as failing to list the new GUC in the two places where we
document all GUC_REPORT variables.
What I did *not* push was the change to mark transaction_read_only
as GUC_REPORT. I find that idea highly dubious, for a couple of
reasons:
* It'll create useless ParameterStatus traffic during normal operations
of an application using "START TRANSACTION READ ONLY" or the like.
* I do not think it will actually work for the desired purpose of
finding out the read-only state during session connection. AFAICS,
we don't set XactReadOnly to a meaningful value except when starting
a transaction. Yeah, we'll run a transaction during login because
we have to examine the system catalogs ... but do we start a new
one after absorbing the effects of, say, ALTER USER SET
default_transaction_read_only? I doubt it, and even if it works
today it'd be fragile, because someday somebody will try to collapse
any multiple transactions during startup into one transaction.
I think what we want to do is mark default_transaction_read_only as
GUC_REPORT, instead. That will give a reliable report of what the
state of its GUC is, and you can combine it with in_hot_standby
to decide whether the session should be considered read-only.
If you don't get those two GUC values during connection, then you
can fall back on "SHOW transaction_read_only".
I have prepared a patch that implements the changes suggested above and
rebased it on the current HEAD.
The attached v21 patch contains these changes.
Thoughts?
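
For readers skimming the attachment, the "prefer-standby" rule stated in the patch's commit message (take a hot-standby server if one is available, otherwise a server that is not in hot standby) can be modelled roughly as below. This is a simplified sketch, not the patch's code: HostModel and choose_prefer_standby are hypothetical names, host state is given up front instead of being discovered over the wire, and the real patch reconnects to the remembered primary instead of returning an index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one host in the connection string. */
typedef struct
{
    bool reachable;         /* could a connection be established? */
    bool in_hot_standby;    /* is the server in hot standby mode? */
} HostModel;

/*
 * Return the index of the host that a prefer-standby scan would pick,
 * or -1 if no host is usable.  Hosts are tried in the order given.
 */
int
choose_prefer_standby(const HostModel *hosts, int nhosts)
{
    int first_primary = -1;     /* plays the role of which_primary_host */

    assert(nhosts == 0 || hosts != NULL);

    /* First pass: take the first standby, remembering the first primary. */
    for (int i = 0; i < nhosts; i++)
    {
        if (!hosts[i].reachable)
            continue;
        if (hosts[i].in_hot_standby)
            return i;
        if (first_primary == -1)
            first_primary = i;
    }

    /* No standby was found: fall back to the remembered primary, if any. */
    return first_primary;
}
```

Only the first reachable primary is remembered, which mirrors how the patch sets which_primary_host once and then reuses it for the second attempt.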
Regards,
Vignesh
Attachments:
v21-0001-Enhance-the-libpq-target_session_attrs-connectio.patch (application/x-patch)
From 9d85abfe1e4b43d67ee746891830abe53077c0e7 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 8 Feb 2021 11:23:31 +0530
Subject: [PATCH v21] Enhance the libpq "target_session_attrs" connection
parameter.
Enhance the connection parameter "target_session_attrs" to support new values:
read-only/primary/standby/prefer-standby.
Add a new "read-only" target_session_attrs option value, to support connecting
to a read-only server if available from the list of hosts (otherwise the
connection attempt fails).
Add a new "primary" target_session_attrs option value, to support connecting to
a server which is not in hot-standby mode, if available from the list of hosts
(otherwise the connection attempt fails).
Add a new "standby" target_session_attrs option value, to support connecting to
a server which is in hot-standby mode, if available from the list of hosts
(otherwise the connection attempt fails).
Add a new "prefer-standby" target_session_attrs option value, to support
connecting to a server which is in hot-standby mode, if available from the list
of hosts (otherwise connect to a server which is not in hot-standby mode).
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
doc/src/sgml/high-availability.sgml | 16 +-
doc/src/sgml/libpq.sgml | 70 +++++-
doc/src/sgml/protocol.sgml | 5 +-
src/backend/utils/misc/guc.c | 3 +-
src/interfaces/libpq/fe-connect.c | 432 ++++++++++++++++++++++++++++++----
src/interfaces/libpq/fe-exec.c | 18 +-
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 52 +++-
src/test/recovery/t/001_stream_rep.pl | 79 ++++++-
src/tools/pgindent/typedefs.list | 2 +
10 files changed, 599 insertions(+), 81 deletions(-)
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index f49f5c0..a454e93 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1700,14 +1700,14 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
</para>
<para>
- During hot standby, the parameter <varname>transaction_read_only</varname> is always
- true and may not be changed. But as long as no attempt is made to modify
- the database, connections during hot standby will act much like any other
- database connection. If failover or switchover occurs, the database will
- switch to normal processing mode. Sessions will remain connected while the
- server changes mode. Once hot standby finishes, it will be possible to
- initiate read-write transactions (even from a session begun during
- hot standby).
+ During hot standby, the parameters <varname>in_hot_standby</varname> and
+ <varname>default_transaction_read_only</varname> are always true and may
+ not be changed. But as long as no attempt is made to modify the database,
+ connections during hot standby will act much like any other database
+ connection. If failover or switchover occurs, the database will switch to
+ normal processing mode. Sessions will remain connected while the server
+ changes mode. Once hot standby finishes, it will be possible to initiate
+ read-write transactions (even from a session begun during hot standby).
</para>
<para>
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index b7a8245..12bacda 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1836,18 +1836,63 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- If this parameter is set to <literal>read-write</literal>, only a
- connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>read-write</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>standby</literal> and
+ <literal>prefer-standby</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts are specified in the
+ connection string, each host is tried in the order given until a connection
+ is successful.
+ </para>
+
+ <para>
+ The support of read-write transactions is determined by the value of the
+ <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> configuration parameters, which are
+ either reported by the server (if supported) upon successful connection
+ or, failing that, determined by sending
+ <literal>SHOW transaction_read_only</literal> after successful connection; if
+ it returns <literal>on</literal>, it means the server doesn't support
+ read-write transactions. The standby mode state is determined by either
+ the value of the <varname>in_hot_standby</varname> configuration
+ parameter, that is reported by the server (if supported) upon
+ successful connection, or is otherwise explicitly queried by sending
+ <literal>SELECT pg_is_in_recovery()</literal> after successful
+ connection; if it returns <literal>t</literal>, it means the server is
+ in hot standby mode.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-write</literal>, only a connection in
+ which read-write transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection in
+ which read-only transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then only a connection in
+ which the server is not in hot standby mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, then only a connection in
+ which the server is in hot standby mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, then a connection
+ in which the server is in hot standby mode is preferred. Otherwise, if no such
+ connections can be found, then a connection in which the server is not in hot
+ standby mode will be considered.
+ </para>
+
</listitem>
- </varlistentry>
+ </varlistentry>
+
</variablelist>
</para>
</sect2>
@@ -2150,6 +2195,7 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>server_encoding</varname>,
<varname>client_encoding</varname>,
<varname>application_name</varname>,
+ <varname>default_transaction_read_only</varname>,
<varname>in_hot_standby</varname>,
<varname>is_superuser</varname>,
<varname>session_authorization</varname>,
@@ -2165,6 +2211,7 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before
9.0;
+ <varname>default_transaction_read_only</varname> and
<varname>in_hot_standby</varname> was not reported by releases before
14.)
Note that
@@ -7268,6 +7315,7 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
linkend="libpq-connect-target-session-attrs"/> connection parameter.
</para>
</listitem>
+
</itemizedlist>
</para>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 3763b4b..b24225c 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1278,6 +1278,7 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>server_encoding</varname>,
<varname>client_encoding</varname>,
<varname>application_name</varname>,
+ <varname>default_transaction_read_only</varname>,
<varname>in_hot_standby</varname>,
<varname>is_superuser</varname>,
<varname>session_authorization</varname>,
@@ -1293,8 +1294,8 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before
9.0;
- <varname>in_hot_standby</varname> was not reported by releases before
- 14.)
+ <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> were not reported by releases before 14.)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index eafdb11..3dba213 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1619,7 +1619,8 @@ static struct config_bool ConfigureNamesBool[] =
{
{"default_transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the default read-only status of new transactions."),
- NULL
+ NULL,
+ GUC_REPORT
},
&DefaultXactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 8ca0583..4e6aba8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -352,7 +352,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1006,6 +1006,33 @@ parse_comma_separated_list(char **startptr, bool *more)
}
/*
+ * validateAndGetTargetServerType
+ *
+ * Validate a given target_session_attrs value and get the requested server type.
+ *
+ * Returns true if OK, false if the specified option value is invalid.
+ */
+static bool
+validateAndGetTargetServerType(const char *optionValue, TargetServerType *requestedServerType)
+{
+ if (strcmp(optionValue, "any") == 0)
+ *requestedServerType = SERVER_TYPE_ANY;
+ else if (strcmp(optionValue, "primary") == 0)
+ *requestedServerType = SERVER_TYPE_PRIMARY;
+ else if (strcmp(optionValue, "read-write") == 0)
+ *requestedServerType = SERVER_TYPE_READ_WRITE;
+ else if (strcmp(optionValue, "read-only") == 0)
+ *requestedServerType = SERVER_TYPE_READ_ONLY;
+ else if (strcmp(optionValue, "prefer-standby") == 0)
+ *requestedServerType = SERVER_TYPE_PREFER_STANDBY;
+ else if (strcmp(optionValue, "standby") == 0)
+ *requestedServerType = SERVER_TYPE_STANDBY;
+ else
+ return false;
+ return true;
+}
+
+/*
* connectOptions2
*
* Compute derived connection options after absorbing all user-supplied info.
@@ -1401,13 +1428,12 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (!validateAndGetTargetServerType(conn->target_session_attrs, &conn->requested_server_type))
{
conn->status = CONNECTION_BAD;
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid %s value: \"%s\"\n"),
- "target_settion_attrs",
+ "target_session_attrs",
conn->target_session_attrs);
return false;
}
@@ -2192,6 +2218,68 @@ connectDBComplete(PGconn *conn)
}
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (read-write/read-only).
+ * The connection state is set to try the next host (if any).
+ */
+static void
+rejectCheckedReadOrWriteConnection(PGconn *conn)
+{
+ if (conn->requested_server_type == SERVER_TYPE_READ_WRITE)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server\n"));
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server\n"));
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (for standby). The connection state
+ * is set to try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the primary host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedStandbyConnection(PGconn *conn)
+{
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is in hot standby mode\n"));
+ else
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("server is not in hot standby mode\n"));
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_primary_host == -1)
+ conn->which_primary_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -2273,6 +2361,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_STANDBY:
break;
default:
@@ -2309,13 +2398,33 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->which_primary_host >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to standby servers
+ * and should now try to re-connect to a
+ * previously-connected-to primary server, whose host index is
+ * recorded in which_primary_host.
+ */
+ conn->whichhost = conn->which_primary_host;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->which_primary_host = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3545,31 +3654,175 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
- /*
- * If a read-write connection is required, see if we have one.
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but
- * by the same token they don't have any read-only mode, so we
- * may just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ if (conn->requested_server_type != SERVER_TYPE_ANY)
{
/*
- * We use PQsendQueryContinue so that conn->errorMessage
- * does not get cleared. We need to preserve any error
- * messages related to previous hosts we have tried and
- * failed to connect to.
+ * If a read-write or read-only connection is required,
+ * see if we have one.
+ *
+ * Servers before 7.4 lack the
+ * default_transaction_read_only & in_hot_standby GUC, but
+ * by the same token they don't have any read-only mode,
+ * so we may just skip the test in that case.
*/
- conn->status = CONNECTION_OK;
- if (!PQsendQueryContinue(conn,
- "SHOW transaction_read_only"))
- goto error_return;
- conn->status = CONNECTION_CHECK_WRITABLE;
- return PGRES_POLLING_READING;
+
+ if (conn->sversion >= 70400 &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY))
+ {
+ /*
+ * For servers which don't have
+ * "default_transaction_read_only" or "in_hot_standby"
+ * as a GUC_REPORT variable, it is necessary to
+ * determine if they are read-only by sending the
+ * query "SHOW transaction_read_only".
+ */
+ if (conn->default_transaction_read_only == GUC_BOOL_UNKNOWN ||
+ conn->in_hot_standby == GUC_BOOL_UNKNOWN)
+ {
+ /*
+ * We use PQsendQueryContinue so that
+ * conn->errorMessage does not get cleared. We
+ * need to preserve any error messages related to
+ * previous hosts we have tried and failed to
+ * connect to.
+ */
+ conn->status = CONNECTION_OK;
+ if (!PQsendQueryContinue(conn, "SHOW transaction_read_only"))
+ goto error_return;
+
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ return PGRES_POLLING_READING;
+ }
+ else if (((conn->default_transaction_read_only == GUC_BOOL_YES ||
+ conn->in_hot_standby == GUC_BOOL_YES) &&
+ conn->requested_server_type == SERVER_TYPE_READ_WRITE) ||
+ (conn->default_transaction_read_only == GUC_BOOL_NO &&
+ conn->in_hot_standby == GUC_BOOL_NO &&
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY))
+ {
+ /*
+ * Server is read-only but requested read-write,
+ * or server is read-write but requested
+ * read-only, reject and continue to process any
+ * further hosts ...
+ */
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * Servers before 9.0 don't support standby mode, skip the
+ * check when the requested type of connection is primary,
+ * prefer-standby or standby.
+ */
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_server_type == SERVER_TYPE_PRIMARY ||
+ conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * For servers which don't have the "in_hot_standby"
+ * GUC_REPORT variable, it is necessary to determine
+ * if they are in hot standby mode by sending the
+ * query "SELECT pg_is_in_recovery()".
+ */
+ if (conn->in_hot_standby == GUC_BOOL_UNKNOWN)
+ {
+ /*
+ * We use PQsendQueryContinue so that
+ * conn->errorMessage does not get cleared. We
+ * need to preserve any error messages related to
+ * previous hosts we have tried and failed to
+ * connect to.
+ */
+ conn->status = CONNECTION_OK;
+ if (!PQsendQueryContinue(conn, "SELECT pg_is_in_recovery()"))
+ goto error_return;
+
+ conn->status = CONNECTION_CHECK_STANDBY;
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->in_hot_standby == GUC_BOOL_YES &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (conn->in_hot_standby == GUC_BOOL_NO &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * Server is in standby but requested primary, or
+ * server is not in standby but requested
+ * prefer-standby/standby, reject and continue to
+ * process any further hosts ...
+ */
+ if (conn->which_primary_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of
+ * the list of connections, as it couldn't
+ * find any servers that are in standby mode.
+ */
+ goto consume_checked_target_connection;
+ }
+
+ rejectCheckedStandbyConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * If the requested type is prefer-standby, then record
+ * this host index and try any others before considering
+ * it later. If the requested type of connection is
+ * read-only or standby, ignore this connection, as
+ * servers of this version don't support this type of
+ * connection.
+ */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
+ {
+ if (conn->which_primary_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of the
+ * list of connections, as it couldn't find any
+ * servers that are in standby mode.
+ */
+ goto consume_checked_target_connection;
+ }
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY)
+ {
+ if (conn->which_primary_host == -1)
+ conn->which_primary_host = conn->whichhost;
+ }
+
+ /*
+ * Try the next host, if any, but we don't want to
+ * consider additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3641,6 +3894,7 @@ keep_going: /* We will come back to here until there is
conn->status = CONNECTION_OK;
return PGRES_POLLING_OK;
}
+
case CONNECTION_CHECK_WRITABLE:
{
conn->status = CONNECTION_OK;
@@ -3658,30 +3912,112 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested server type is
+ * read-write, ignore this connection. Server is
+ * read-write and requested type is read-only, ignore this
+ * connection.
+ */
+ if ((readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE)) ||
+ (!readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_ONLY)))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
PQclear(res);
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
+ /* Session is requested type, so we're good. */
+ PQclear(res);
- /* Append error report to conn->errorMessage. */
- appendPQExpBufferStr(&conn->errorMessage,
- libpq_gettext("session is read-only\n"));
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
+ /*
+ * Something went wrong with "SHOW transaction_read_only". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+
+ /* Append error report to conn->errorMessage. */
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("test \"SHOW transaction_read_only\" failed\n"));
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
+
+ case CONNECTION_CHECK_STANDBY:
+ {
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ goto error_return;
+ }
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_STANDBY;
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in standby mode and requested mode is
+ * primary, ignore it. Server is not in standby mode and
+ * requested mode is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server was found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
/*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections, as it couldn't find any servers
+ * that are in standby mode.
*/
- conn->try_next_host = true;
+ if (conn->which_primary_host == -2)
+ goto consume_checked_standby_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+
+ rejectCheckedStandbyConnection(conn);
goto keep_going;
}
-
- /* Session is read-write, so we're good. */
+ consume_checked_standby_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
/*
@@ -3693,7 +4029,7 @@ keep_going: /* We will come back to here until there is
}
/*
- * Something went wrong with "SHOW transaction_read_only". We
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
* should try next addresses.
*/
if (res)
@@ -3701,7 +4037,7 @@ keep_going: /* We will come back to here until there is
/* Append error report to conn->errorMessage. */
appendPQExpBufferStr(&conn->errorMessage,
- libpq_gettext("test \"SHOW transaction_read_only\" failed\n"));
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed\n"));
/* Close connection politely. */
conn->status = CONNECTION_OK;
@@ -3711,7 +4047,6 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
-
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3855,10 +4190,15 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->default_transaction_read_only = GUC_BOOL_UNKNOWN;
+ conn->in_hot_standby = GUC_BOOL_UNKNOWN;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ conn->which_primary_host = -1;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index e730753..60cd4a7 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1008,11 +1008,11 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
- * standard_conforming_strings, and convert server version to a numeric
- * form. We keep the first two of these in static variables as well, so
- * that PQescapeString and PQescapeBytea can behave somewhat sanely (at
- * least in single-connection-using programs).
+ * Special hacks: remember client_encoding, standard_conforming_strings,
+ * default_transaction_read_only and in_hot_standby, and convert server
+ * version to a numeric form. We keep the first two of these in static
+ * variables as well, so that PQescapeString and PQescapeBytea can
+ * behave somewhat sanely (at least in single-connection-using programs).
*/
if (strcmp(name, "client_encoding") == 0)
{
@@ -1062,6 +1062,14 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "default_transaction_read_only") == 0)
+ {
+ conn->default_transaction_read_only = (strcmp(value, "on") == 0 ? GUC_BOOL_YES : GUC_BOOL_NO);
+ }
+ else if (strcmp(name, "in_hot_standby") == 0)
+ {
+ conn->in_hot_standby = (strcmp(value, "on") == 0 ? GUC_BOOL_YES : GUC_BOOL_NO);
+ }
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index c266ad5..5c1b349 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_STANDBY /* Check whether server is in standby mode */
} ConnStatusType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4db4983..0ba81ae 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -317,6 +317,29 @@ typedef struct pg_conn_host
* found in password file. */
} pg_conn_host;
+/* Target server type to connect to */
+typedef enum
+{
+ SERVER_TYPE_ANY = 0, /* Any server (default) */
+ SERVER_TYPE_READ_WRITE, /* Read-write server */
+ SERVER_TYPE_READ_ONLY, /* Read-only server */
+ SERVER_TYPE_PRIMARY, /* Primary server */
+ SERVER_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SERVER_TYPE_STANDBY /* Standby server */
+} TargetServerType;
+
+/*
+ * State of certain bool GUCs used by libpq, which are determined
+ * either by the GUC_REPORT mechanism (where supported by the server
+ * version) or by lazy evaluation (using a query sent to the server).
+ */
+typedef enum
+{
+ GUC_BOOL_UNKNOWN = 0, /* Currently unknown */
+ GUC_BOOL_YES, /* Yes (true) */
+ GUC_BOOL_NO /* No (false) */
+} GucBoolState;
+
/*
* PGconn stores all the state data associated with a single connection
* to a backend.
@@ -370,9 +393,17 @@ struct pg_conn
char *ssl_min_protocol_version; /* minimum TLS protocol version */
char *ssl_max_protocol_version; /* maximum TLS protocol version */
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: "any", "read-write",
+ * "read-only", "primary", "prefer-standby", "standby".
+ */
char *target_session_attrs;
+ /*
+ * The requested server type, derived from target_session_attrs.
+ */
+ TargetServerType requested_server_type;
+
/* Optional file to write trace info to */
FILE *Pfdebug;
@@ -406,6 +437,21 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * Index of the first primary host encountered (if any) in the connection
+ * string. This is used during processing of requested server connection
+ * type SERVER_TYPE_PREFER_STANDBY.
+ *
+ * The initial value is -1, indicating that no primary host has yet been
+ * found. It is then set to the index of the first primary host, if one is
+ * found in the connection string during processing. If a second
+ * connection attempt is later made to that primary host (because no
+ * connection to a standby server could be made), which_primary_host is
+ * then set to -2 to avoid recursion during subsequent processing (and
+ * whichhost is set to the primary host index).
+ */
+ int which_primary_host;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
@@ -436,6 +482,10 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ GucBoolState default_transaction_read_only; /* default_transaction_read_only
+ * GUC report variable state */
+ GucBoolState in_hot_standby; /* in_hot_standby GUC report variable
+ * state */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 9e31a53..15d0273 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 49;
# Initialize primary node
my $node_primary = get_new_node('primary');
@@ -85,7 +85,7 @@ sub test_target_session_attrs
my $node2_port = $node2->port;
my $node2_name = $node2->name;
- my $target_name = $target_node->name;
+ my $target_name = defined($target_node) ? $target_node->name : undef;
# Build connection string for connection attempt.
my $connstr = "host=$node1_host,$node2_host ";
@@ -97,10 +97,25 @@ sub test_target_session_attrs
my ($ret, $stdout, $stderr) =
$node1->psql('postgres', 'SHOW port;',
extra_params => [ '-d', $connstr ]);
- is( $status == $ret && $stdout eq $target_node->port,
- 1,
- "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
- );
+ if ($status == 0)
+ {
+ is( $status == $ret && $stdout eq $target_node->port,
+ 1,
+ "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
+ else
+ {
+ print "status = $status\n";
+ print "ret = $ret\n";
+ print "stdout = $stdout\n";
+ print "stderr = $stderr\n";
+
+ is( $status == $ret,
+ 1,
+ "fail to connect to any nodes if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
return;
}
@@ -121,6 +136,58 @@ test_target_session_attrs($node_primary, $node_standby_1, $node_primary, "any",
test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
"any", 0);
+# Connect to primary in "primary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "primary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_primary,
+ "primary", 0);
+
+# Connect to standby1 in "read-only" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "read-only", 0);
+
+# Connect to primary in "prefer-standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, $node_primary,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "standby", 0);
+
+# Fail to connect in "read-write" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "read-write", 2);
+
+# Fail to connect in "primary" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "primary", 2);
+
+# Fail to connect in "read-only" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "read-only", 2);
+
+# Fail to connect in "standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "standby", 2);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 1d540fe..4a9cc23 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -950,6 +950,7 @@ GroupingSetsPath
GucAction
GucBoolAssignHook
GucBoolCheckHook
+GucBoolState
GucContext
GucEnumAssignHook
GucEnumCheckHook
@@ -2512,6 +2513,7 @@ TapeShare
TarMethodData
TarMethodFile
TargetEntry
+TargetServerType
TclExceptionNameMap
Tcl_DString
Tcl_FileProc
--
1.8.3.1
On Mon, Feb 8, 2021 at 8:17 PM vignesh C <vignesh21@gmail.com> wrote:
I think what we want to do is mark default_transaction_read_only as
GUC_REPORT, instead. That will give a reliable report of what the
state of its GUC is, and you can combine it with is_hot_standby
to decide whether the session should be considered read-only.
If you don't get those two GUC values during connection, then you
can fall back on "SHOW transaction_read_only".
I have made a patch for the above with the changes suggested and
rebased it with the head code.
Attached v21 patch which has the changes for the same.
Thoughts?
I'm still looking at the patch code, but I noticed that the
documentation update describing how support of read-write transactions
is determined isn't quite right and it isn't clear how the parameters
work.
I'd suggest something like the following (you'd need to fix the line
lengths and line-wrapping appropriately) - please check it for
correctness:
<para>
The support of read-write transactions is determined by the value of the
<varname>default_transaction_read_only</varname> and
<varname>in_hot_standby</varname> configuration parameters,
that, if supported,
are reported by the server upon successful connection. If the
value of either
of these parameters is <literal>on</literal>, it means the
server doesn't support
read-write transactions. If either/both of these parameters
are not reported,
then the support of read-write transactions is determined by
an explicit query,
by sending <literal>SHOW transaction_read_only</literal> after
successful
connection; if it returns <literal>on</literal>, it means the
server doesn't
support read-write transactions. The standby mode state is
determined by either
the value of the <varname>in_hot_standby</varname> configuration
parameter, that is reported by the server (if supported) upon
successful connection, or is otherwise explicitly queried by sending
<literal>SELECT pg_is_in_recovery()</literal> after successful
connection; if it returns <literal>t</literal>, it means the server is
in hot standby mode.
</para>
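To make the rule in the paragraph above easier to check, here is a rough sketch of the decision it describes. It is only an illustration and not code from the patch: GucBoolState mirrors the enum the patch adds to libpq-int.h, while ReadWriteVerdict, StandbyVerdict, classify_read_write() and classify_standby() are names I made up for this example.

```c
/* Tri-state mirroring the GucBoolState enum added by the patch. */
typedef enum
{
    GUC_BOOL_UNKNOWN = 0,       /* not reported by the server */
    GUC_BOOL_YES,               /* reported as "on" */
    GUC_BOOL_NO                 /* reported as "off" */
} GucBoolState;

/* Hypothetical result types, used only in this sketch. */
typedef enum
{
    RW_NEED_QUERY,              /* must send SHOW transaction_read_only */
    RW_READ_ONLY,               /* server does not accept read-write xacts */
    RW_READ_WRITE               /* server accepts read-write xacts */
} ReadWriteVerdict;

typedef enum
{
    SB_NEED_QUERY,              /* must send SELECT pg_is_in_recovery() */
    SB_STANDBY,                 /* server is in hot standby mode */
    SB_PRIMARY                  /* server is not in hot standby mode */
} StandbyVerdict;

/*
 * Read-write support: if either parameter was not reported, fall back to
 * an explicit query; otherwise "on" for either one means read-only.
 */
static ReadWriteVerdict
classify_read_write(GucBoolState default_transaction_read_only,
                    GucBoolState in_hot_standby)
{
    if (default_transaction_read_only == GUC_BOOL_UNKNOWN ||
        in_hot_standby == GUC_BOOL_UNKNOWN)
        return RW_NEED_QUERY;

    if (default_transaction_read_only == GUC_BOOL_YES ||
        in_hot_standby == GUC_BOOL_YES)
        return RW_READ_ONLY;

    return RW_READ_WRITE;
}

/* Standby state: use in_hot_standby if reported, else query the server. */
static StandbyVerdict
classify_standby(GucBoolState in_hot_standby)
{
    if (in_hot_standby == GUC_BOOL_UNKNOWN)
        return SB_NEED_QUERY;

    return (in_hot_standby == GUC_BOOL_YES) ? SB_STANDBY : SB_PRIMARY;
}
```

For example, classify_read_write(GUC_BOOL_NO, GUC_BOOL_YES) gives RW_READ_ONLY, and any call with a GUC_BOOL_UNKNOWN argument gives the "need query" result, which corresponds to the fallback on SHOW transaction_read_only or SELECT pg_is_in_recovery().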
Regards,
Greg Nancarrow
Fujitsu Australia
On Mon, Feb 8, 2021 at 8:17 PM vignesh C <vignesh21@gmail.com> wrote:
I think what we want to do is mark default_transaction_read_only as
GUC_REPORT, instead. That will give a reliable report of what the
state of its GUC is, and you can combine it with is_hot_standby
to decide whether the session should be considered read-only.
If you don't get those two GUC values during connection, then you
can fall back on "SHOW transaction_read_only".
I have made a patch for the above with the changes suggested and
rebased it with the head code.
Attached v21 patch which has the changes for the same.
Thoughts?
Further to my other doc change feedback, I can only spot the following
minor things (otherwise the changes that you have made seem OK to me).
1) doc/src/sgml/protocol.sgml
<varname>default_transaction_read_only</varname> and
<varname>in_hot_standby</varname> were not reported by releases before
14.)
should be:
<varname>default_transaction_read_only</varname> and
<varname>in_hot_standby</varname> were not reported by releases before
14.0)
2) doc/src/sgml/high-availability.sgml
<para>
During hot standby, the parameter <varname>in_hot_standby</varname> and
<varname>default_transaction_read_only</varname> are always true and may
not be changed.
should be:
<para>
During hot standby, the parameters <varname>in_hot_standby</varname> and
<varname>transaction_read_only</varname> are always true and may
not be changed.
[I believe that there are only checks on attempts to change
"transaction_read_only" when in hot standby, not
"default_transaction_read_only"; see check_transaction_read_only()]
3) src/interfaces/libpq/fe-connect.c
In rejectCheckedReadOrWriteConnection() and
rejectCheckedStandbyConnection(), now that host and port info are
emitted separately and are not included in each error message string
(as parameters in a format string), I think those functions should use
appendPQExpBufferStr() instead of appendPQExpBuffer(), as it's more
efficient if there is just a single string argument.
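To illustrate the difference, here is a small stand-alone sketch. It does not use the real PQExpBuffer code: DemoBuf, demo_append_str() and demo_append_fmt() are made-up stand-ins for PQExpBuffer, appendPQExpBufferStr() and appendPQExpBuffer(), and the fixed-size buffer (which truncates rather than grows) is an assumption made to keep the example short.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size buffer standing in for PQExpBuffer. */
typedef struct
{
    char        data[256];
    size_t      len;
} DemoBuf;

/* Stand-in for appendPQExpBufferStr(): a plain copy, no format parsing. */
static void
demo_append_str(DemoBuf *buf, const char *str)
{
    size_t      n = strlen(str);
    size_t      room = sizeof(buf->data) - 1 - buf->len;

    if (n > room)
        n = room;               /* truncate; the real buffer grows instead */
    memcpy(buf->data + buf->len, str, n);
    buf->len += n;
    buf->data[buf->len] = '\0';
}

/* Stand-in for appendPQExpBuffer(): printf-style, has to scan the format. */
static void
demo_append_fmt(DemoBuf *buf, const char *fmt, ...)
{
    va_list     args;
    size_t      room = sizeof(buf->data) - buf->len;   /* includes the NUL */
    int         n;

    va_start(args, fmt);
    n = vsnprintf(buf->data + buf->len, room, fmt, args);
    va_end(args);

    if (n < 0)
        return;                 /* formatting error, leave length unchanged */
    if ((size_t) n >= room)
        n = (int) (room - 1);   /* output was truncated */
    buf->len += (size_t) n;
}
```

With a constant message and no arguments, both calls produce the same text, but the formatted variant still goes through the printf machinery, which is why the plain-string variant is the better fit for those two functions.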
Regards,
Greg Nancarrow
Fujitsu Australia
On Tue, Feb 9, 2021 at 5:47 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Feb 8, 2021 at 8:17 PM vignesh C <vignesh21@gmail.com> wrote:
I think what we want to do is mark default_transaction_read_only as
GUC_REPORT, instead. That will give a reliable report of what the
state of its GUC is, and you can combine it with is_hot_standby
to decide whether the session should be considered read-only.
If you don't get those two GUC values during connection, then you
can fall back on "SHOW transaction_read_only".
I have made a patch for the above with the changes suggested and
rebased it with the head code.
Attached v21 patch which has the changes for the same.
Thoughts?
I'm still looking at the patch code, but I noticed that the
documentation update describing how support of read-write transactions
is determined isn't quite right and it isn't clear how the parameters
work.
I'd suggest something like the following (you'd need to fix the line
lengths and line-wrapping appropriately) - please check it for
correctness:
<para>
The support of read-write transactions is determined by the value of the
<varname>default_transaction_read_only</varname> and
<varname>in_hot_standby</varname> configuration parameters,
that, if supported,
are reported by the server upon successful connection. If the
value of either
of these parameters is <literal>on</literal>, it means the
server doesn't support
read-write transactions. If either/both of these parameters
are not reported,
then the support of read-write transactions is determined by
an explicit query,
by sending <literal>SHOW transaction_read_only</literal> after
successful
connection; if it returns <literal>on</literal>, it means the
server doesn't
support read-write transactions. The standby mode state is
determined by either
the value of the <varname>in_hot_standby</varname> configuration
parameter, that is reported by the server (if supported) upon
successful connection, or is otherwise explicitly queried by sending
<literal>SELECT pg_is_in_recovery()</literal> after successful
connection; if it returns <literal>t</literal>, it means the server is
in hot standby mode.
</para>
Thanks, Greg, for the comments. Please find attached the v22 patch,
which has the fixes for the same.
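As a reading aid for the prefer-standby handling, the host scan can be pictured roughly as below. This is a simplified model and not code from the patch: choose_prefer_standby() is a made-up name, every listed host is assumed to be reachable, and the real code in PQconnectPoll() reconnects to the remembered host and uses the value -2 in which_primary_host to stop a further pass, which the sketch replaces with a plain return.

```c
#include <stdbool.h>

/*
 * Simplified model of the host scan for target_session_attrs=prefer-standby.
 *
 * is_standby[i] says whether host i is in hot standby mode.  Returns the
 * index of the chosen host, or -1 if the host list is empty.
 */
static int
choose_prefer_standby(const bool *is_standby, int nhosts)
{
    int         which_primary_host = -1;    /* -1: no primary seen yet */
    int         i;

    /* First pass: take the first standby, remember the first primary. */
    for (i = 0; i < nhosts; i++)
    {
        if (is_standby[i])
            return i;
        if (which_primary_host == -1)
            which_primary_host = i;
    }

    /* No standby found: fall back to the remembered primary, if any. */
    return which_primary_host;
}
```

This matches the expectations in the added 001_stream_rep.pl tests: with a primary,standby1 list the standby is chosen, and with a primary,primary list the connection falls back to the primary.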
Thoughts?
Regards,
Vignesh
Attachments:
v22-0001-Enhance-the-libpq-target_session_attrs-connectio.patch (text/x-patch; charset=US-ASCII)
From db9b894d8a8c51de80e6b3b7b8c660dda5374ff9 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 8 Feb 2021 11:23:31 +0530
Subject: [PATCH v22] Enhance the libpq "target_session_attrs" connection
parameter.
Enhance the connection parameter "target_session_attrs" to support new values:
read-only/primary/standby/prefer-standby.
Add a new "read-only" target_session_attrs option value, to support connecting
to a read-only server if available from the list of hosts (otherwise the
connection attempt fails).
Add a new "primary" target_session_attrs option value, to support connecting to
a server which is not in hot-standby mode, if available from the list of hosts
(otherwise the connection attempt fails).
Add a new "standby" target_session_attrs option value, to support connecting to
a server which is in hot-standby mode, if available from the list of hosts
(otherwise the connection attempt fails).
Add a new "prefer-standby" target_session_attrs option value, to support
connecting to a server which is in hot-standby mode, if available from the list
of hosts (otherwise connect to a server which is not in hot-standby mode).
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
doc/src/sgml/high-availability.sgml | 16 +-
doc/src/sgml/libpq.sgml | 73 +++++-
doc/src/sgml/protocol.sgml | 5 +-
src/backend/utils/misc/guc.c | 3 +-
src/interfaces/libpq/fe-connect.c | 432 ++++++++++++++++++++++++++++++----
src/interfaces/libpq/fe-exec.c | 18 +-
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 52 +++-
src/test/recovery/t/001_stream_rep.pl | 79 ++++++-
src/tools/pgindent/typedefs.list | 2 +
10 files changed, 602 insertions(+), 81 deletions(-)
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index f49f5c0..2bbd52c 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1700,14 +1700,14 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
</para>
<para>
- During hot standby, the parameter <varname>transaction_read_only</varname> is always
- true and may not be changed. But as long as no attempt is made to modify
- the database, connections during hot standby will act much like any other
- database connection. If failover or switchover occurs, the database will
- switch to normal processing mode. Sessions will remain connected while the
- server changes mode. Once hot standby finishes, it will be possible to
- initiate read-write transactions (even from a session begun during
- hot standby).
+ During hot standby, the parameters <varname>in_hot_standby</varname> and
+ <varname>transaction_read_only</varname> are always true and may not be
+ changed. But as long as no attempt is made to modify the database,
+ connections during hot standby will act much like any other database
+ connection. If failover or switchover occurs, the database will switch to
+ normal processing mode. Sessions will remain connected while the server
+ changes mode. Once hot standby finishes, it will be possible to initiate
+ read-write transactions (even from a session begun during hot standby).
</para>
<para>
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index b7a8245..d5f0f24 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1836,18 +1836,66 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- If this parameter is set to <literal>read-write</literal>, only a
- connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>read-write</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>standby</literal> and
+ <literal>prefer-standby</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts are specified in the
+ connection string, each host is tried in the order given until a connection
+ is successful.
+ </para>
+
+ <para>
+ The support of read-write transactions is determined by the value of the
+ <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> configuration parameters, which are
+ reported by the server (if supported) upon successful connection. If
+ the value of either of these parameters is <literal>on</literal>, it
+ means the server doesn't support read-write transactions. If
+ either/both of these parameters are not reported, then the support of
+ read-write transactions is determined by an explicit query, by sending
+ <literal>SHOW transaction_read_only</literal> after successful
+ connection; if it returns <literal>on</literal>, it means the server
+ doesn't support read-write transactions. The standby mode state is
+ determined by either the value of the <varname>in_hot_standby</varname>
+ configuration parameter, that is reported by the server (if supported)
+ upon successful connection, or is otherwise explicitly queried by
+ sending <literal>SELECT pg_is_in_recovery()</literal> after successful
+ connection; if it returns <literal>t</literal>, it means the server is
+ in hot standby mode.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-write</literal>, only a connection in
+ which read-write transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection in
+ which read-only transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then only a connection in
+ which the server is not in hot standby mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, then only a connection in
+ which the server is in hot standby mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, then a connection
+ in which the server is in hot standby mode is preferred. Otherwise, if no such
+ connections can be found, then a connection in which the server is not in hot
+ standby mode will be considered.
+ </para>
+
</listitem>
- </varlistentry>
+ </varlistentry>
+
</variablelist>
</para>
</sect2>
@@ -2150,6 +2198,7 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>server_encoding</varname>,
<varname>client_encoding</varname>,
<varname>application_name</varname>,
+ <varname>default_transaction_read_only</varname>,
<varname>in_hot_standby</varname>,
<varname>is_superuser</varname>,
<varname>session_authorization</varname>,
@@ -2165,6 +2214,7 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before
9.0;
+ <varname>default_transaction_read_only</varname> and
<varname>in_hot_standby</varname> was not reported by releases before
14.)
Note that
@@ -7268,6 +7318,7 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
linkend="libpq-connect-target-session-attrs"/> connection parameter.
</para>
</listitem>
+
</itemizedlist>
</para>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 3763b4b..57d1f2b 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1278,6 +1278,7 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>server_encoding</varname>,
<varname>client_encoding</varname>,
<varname>application_name</varname>,
+ <varname>default_transaction_read_only</varname>,
<varname>in_hot_standby</varname>,
<varname>is_superuser</varname>,
<varname>session_authorization</varname>,
@@ -1293,8 +1294,8 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before
9.0;
- <varname>in_hot_standby</varname> was not reported by releases before
- 14.)
+ <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> were not reported by releases before 14.0)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index eafdb11..3dba213 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1619,7 +1619,8 @@ static struct config_bool ConfigureNamesBool[] =
{
{"default_transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the default read-only status of new transactions."),
- NULL
+ NULL,
+ GUC_REPORT
},
&DefaultXactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 8ca0583..e389a6d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -352,7 +352,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1006,6 +1006,33 @@ parse_comma_separated_list(char **startptr, bool *more)
}
/*
+ * validateAndGetTargetServerType
+ *
+ * Validate a given target_session_attrs value and get the requested server type.
+ *
+ * Returns true if OK, false if the specified option value is invalid.
+ */
+static bool
+validateAndGetTargetServerType(const char *optionValue, TargetServerType *requestedServerType)
+{
+ if (strcmp(optionValue, "any") == 0)
+ *requestedServerType = SERVER_TYPE_ANY;
+ else if (strcmp(optionValue, "primary") == 0)
+ *requestedServerType = SERVER_TYPE_PRIMARY;
+ else if (strcmp(optionValue, "read-write") == 0)
+ *requestedServerType = SERVER_TYPE_READ_WRITE;
+ else if (strcmp(optionValue, "read-only") == 0)
+ *requestedServerType = SERVER_TYPE_READ_ONLY;
+ else if (strcmp(optionValue, "prefer-standby") == 0)
+ *requestedServerType = SERVER_TYPE_PREFER_STANDBY;
+ else if (strcmp(optionValue, "standby") == 0)
+ *requestedServerType = SERVER_TYPE_STANDBY;
+ else
+ return false;
+ return true;
+}
+
+/*
* connectOptions2
*
* Compute derived connection options after absorbing all user-supplied info.
@@ -1401,13 +1428,12 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (!validateAndGetTargetServerType(conn->target_session_attrs, &conn->requested_server_type))
{
conn->status = CONNECTION_BAD;
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid %s value: \"%s\"\n"),
- "target_settion_attrs",
+ "target_session_attrs",
conn->target_session_attrs);
return false;
}
@@ -2192,6 +2218,68 @@ connectDBComplete(PGconn *conn)
}
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (read-write/read-only).
+ * The connection state is set to try the next host (if any).
+ */
+static void
+rejectCheckedReadOrWriteConnection(PGconn *conn)
+{
+ if (conn->requested_server_type == SERVER_TYPE_READ_WRITE)
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server\n"));
+ else
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server\n"));
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (for standby). The connection state
+ * is set to try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the primary host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedStandbyConnection(PGconn *conn)
+{
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("server is in hot standby mode\n"));
+ else
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("server is not in hot standby mode\n"));
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_primary_host == -1)
+ conn->which_primary_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -2273,6 +2361,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_STANDBY:
break;
default:
@@ -2309,13 +2398,33 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->which_primary_host >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to standby servers
+ * and should now try to re-connect to a
+ * previously-connected-to primary server, whose host index is
+ * recorded in which_primary_host.
+ */
+ conn->whichhost = conn->which_primary_host;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->which_primary_host = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3545,31 +3654,175 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
- /*
- * If a read-write connection is required, see if we have one.
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but
- * by the same token they don't have any read-only mode, so we
- * may just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ if (conn->requested_server_type != SERVER_TYPE_ANY)
{
/*
- * We use PQsendQueryContinue so that conn->errorMessage
- * does not get cleared. We need to preserve any error
- * messages related to previous hosts we have tried and
- * failed to connect to.
+ * If a read-write or read-only connection is required,
+ * see if we have one.
+ *
+ * Servers before 7.4 lack the
+ * default_transaction_read_only & in_hot_standby GUC, but
+ * by the same token they don't have any read-only mode,
+ * so we may just skip the test in that case.
*/
- conn->status = CONNECTION_OK;
- if (!PQsendQueryContinue(conn,
- "SHOW transaction_read_only"))
- goto error_return;
- conn->status = CONNECTION_CHECK_WRITABLE;
- return PGRES_POLLING_READING;
+
+ if (conn->sversion >= 70400 &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY))
+ {
+ /*
+ * For servers which don't have
+ * "default_transaction_read_only" or "in_hot_standby"
+ * as a GUC_REPORT variable, it is necessary to
+ * determine if they are read-only by sending the
+ * query "SHOW transaction_read_only".
+ */
+ if (conn->default_transaction_read_only == GUC_BOOL_UNKNOWN ||
+ conn->in_hot_standby == GUC_BOOL_UNKNOWN)
+ {
+ /*
+ * We use PQsendQueryContinue so that
+ * conn->errorMessage does not get cleared. We
+ * need to preserve any error messages related to
+ * previous hosts we have tried and failed to
+ * connect to.
+ */
+ conn->status = CONNECTION_OK;
+ if (!PQsendQueryContinue(conn, "SHOW transaction_read_only"))
+ goto error_return;
+
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ return PGRES_POLLING_READING;
+ }
+ else if (((conn->default_transaction_read_only == GUC_BOOL_YES ||
+ conn->in_hot_standby == GUC_BOOL_YES) &&
+ conn->requested_server_type == SERVER_TYPE_READ_WRITE) ||
+ (conn->default_transaction_read_only == GUC_BOOL_NO &&
+ conn->in_hot_standby == GUC_BOOL_NO &&
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY))
+ {
+ /*
+ * Server is read-only but requested read-write,
+ * or server is read-write but requested
+ * read-only, reject and continue to process any
+ * further hosts ...
+ */
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * Servers before 9.0 don't support standby mode, skip the
+ * check when the requested type of connection is primary,
+ * prefer-standby or standby.
+ */
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_server_type == SERVER_TYPE_PRIMARY ||
+ conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * For servers which don't have the "in_hot_standby"
+ * GUC_REPORT variable, it is necessary to determine
+ * if they are in hot standby mode by sending the
+ * query "SELECT pg_is_in_recovery()".
+ */
+ if (conn->in_hot_standby == GUC_BOOL_UNKNOWN)
+ {
+ /*
+ * We use PQsendQueryContinue so that
+ * conn->errorMessage does not get cleared. We
+ * need to preserve any error messages related to
+ * previous hosts we have tried and failed to
+ * connect to.
+ */
+ conn->status = CONNECTION_OK;
+ if (!PQsendQueryContinue(conn, "SELECT pg_is_in_recovery()"))
+ goto error_return;
+
+ conn->status = CONNECTION_CHECK_STANDBY;
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->in_hot_standby == GUC_BOOL_YES &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (conn->in_hot_standby == GUC_BOOL_NO &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * Server is in standby but requested primary, or
+ * server is not in standby but requested
+ * prefer-standby/standby, reject and continue to
+ * process any further hosts ...
+ */
+ if (conn->which_primary_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of
+ * the list of connections, as it couldn't
+ * find any servers that are in standby mode.
+ */
+ goto consume_checked_target_connection;
+ }
+
+ rejectCheckedStandbyConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * If the requested type is prefer-standby, then record
+ * this host index and try any others before considering
+ * it later. If the requested type of connection is
+ * read-only or standby, ignore this connection, as
+ * servers of this version don't support this type of
+ * connection.
+ */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
+ {
+ if (conn->which_primary_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of the
+ * list of connections, as it couldn't find any
+ * servers that are in standby mode.
+ */
+ goto consume_checked_target_connection;
+ }
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY)
+ {
+ if (conn->which_primary_host == -1)
+ conn->which_primary_host = conn->whichhost;
+ }
+
+ /*
+ * Try the next host, if any, but we don't want to
+ * consider additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3641,6 +3894,7 @@ keep_going: /* We will come back to here until there is
conn->status = CONNECTION_OK;
return PGRES_POLLING_OK;
}
+
case CONNECTION_CHECK_WRITABLE:
{
conn->status = CONNECTION_OK;
@@ -3658,30 +3912,112 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested server type is
+ * read-write, ignore this connection. Server is
+ * read-write and requested type is read-only, ignore this
+ * connection.
+ */
+ if ((readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE)) ||
+ (!readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_ONLY)))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
PQclear(res);
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
+ /* Session is requested type, so we're good. */
+ PQclear(res);
- /* Append error report to conn->errorMessage. */
- appendPQExpBufferStr(&conn->errorMessage,
- libpq_gettext("session is read-only\n"));
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
+ /*
+ * Something went wrong with "SHOW transaction_read_only". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ /* Append error report to conn->errorMessage. */
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("test \"SHOW transaction_read_only\" failed\n"));
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
+
+ case CONNECTION_CHECK_STANDBY:
+ {
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_STANDBY;
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in standby mode and requested mode is
+ * primary, ignore it. Server is not in standby mode and
+ * requested mode is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server was found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
/*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections, as it couldn't find any servers
+ * that are in standby mode.
*/
- conn->try_next_host = true;
+ if (conn->which_primary_host == -2)
+ goto consume_checked_standby_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+
+ rejectCheckedStandbyConnection(conn);
goto keep_going;
}
-
- /* Session is read-write, so we're good. */
+ consume_checked_standby_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
/*
@@ -3693,7 +4029,7 @@ keep_going: /* We will come back to here until there is
}
/*
- * Something went wrong with "SHOW transaction_read_only". We
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
* should try next addresses.
*/
if (res)
@@ -3701,7 +4037,7 @@ keep_going: /* We will come back to here until there is
/* Append error report to conn->errorMessage. */
appendPQExpBufferStr(&conn->errorMessage,
- libpq_gettext("test \"SHOW transaction_read_only\" failed\n"));
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed\n"));
/* Close connection politely. */
conn->status = CONNECTION_OK;
@@ -3711,7 +4047,6 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
-
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3855,10 +4190,15 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->default_transaction_read_only = GUC_BOOL_UNKNOWN;
+ conn->in_hot_standby = GUC_BOOL_UNKNOWN;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ conn->which_primary_host = -1;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index e730753..60cd4a7 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1008,11 +1008,11 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
- * standard_conforming_strings, and convert server version to a numeric
- * form. We keep the first two of these in static variables as well, so
- * that PQescapeString and PQescapeBytea can behave somewhat sanely (at
- * least in single-connection-using programs).
+ * Special hacks: remember client_encoding, default_transaction_read_only,
+ * in_hot_standby and standard_conforming_strings, and convert server version
+ * to a numeric form. We keep client_encoding and standard_conforming_strings
+ * in static variables as well, so that PQescapeString and PQescapeBytea can
+ * behave somewhat sanely (at least in single-connection-using programs).
*/
if (strcmp(name, "client_encoding") == 0)
{
@@ -1062,6 +1062,14 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "default_transaction_read_only") == 0)
+ {
+ conn->default_transaction_read_only = (strcmp(value, "on") == 0 ? GUC_BOOL_YES : GUC_BOOL_NO);
+ }
+ else if (strcmp(name, "in_hot_standby") == 0)
+ {
+ conn->in_hot_standby = (strcmp(value, "on") == 0 ? GUC_BOOL_YES : GUC_BOOL_NO);
+ }
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index c266ad5..5c1b349 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_STANDBY /* Check whether server is in standby mode */
} ConnStatusType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4db4983..0ba81ae 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -317,6 +317,29 @@ typedef struct pg_conn_host
* found in password file. */
} pg_conn_host;
+/* Target server type to connect to */
+typedef enum
+{
+ SERVER_TYPE_ANY = 0, /* Any server (default) */
+ SERVER_TYPE_READ_WRITE, /* Read-write server */
+ SERVER_TYPE_READ_ONLY, /* Read-only server */
+ SERVER_TYPE_PRIMARY, /* Primary server */
+ SERVER_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SERVER_TYPE_STANDBY /* Standby server */
+} TargetServerType;
+
+/*
+ * State of certain bool GUCs used by libpq, which are determined
+ * either by the GUC_REPORT mechanism (where supported by the server
+ * version) or by lazy evaluation (using a query sent to the server).
+ */
+typedef enum
+{
+ GUC_BOOL_UNKNOWN = 0, /* Currently unknown */
+ GUC_BOOL_YES, /* Yes (true) */
+ GUC_BOOL_NO /* No (false) */
+} GucBoolState;
+
/*
* PGconn stores all the state data associated with a single connection
* to a backend.
@@ -370,9 +393,17 @@ struct pg_conn
char *ssl_min_protocol_version; /* minimum TLS protocol version */
char *ssl_max_protocol_version; /* maximum TLS protocol version */
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: "any", "read-write",
+ * "read-only", "primary", "prefer-standby", "standby".
+ */
char *target_session_attrs;
+ /*
+ * The requested server type, derived from target_session_attrs.
+ */
+ TargetServerType requested_server_type;
+
/* Optional file to write trace info to */
FILE *Pfdebug;
@@ -406,6 +437,21 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * Index of the first primary host encountered (if any) in the connection
+ * string. This is used during processing of requested server connection
+ * type SERVER_TYPE_PREFER_STANDBY.
+ *
+ * The initial value is -1, indicating that no primary host has yet been
+ * found. It is then set to the index of the first primary host, if one is
+ * found in the connection string during processing. If a second
+ * connection attempt is later made to that primary host (because no
+ * connection to a standby server could be made), which_primary_host is
+ * then set to -2 to avoid recursion during subsequent processing (and
+ * whichhost is set to the primary host index).
+ */
+ int which_primary_host;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
@@ -436,6 +482,10 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ GucBoolState default_transaction_read_only; /* default_transaction_read_only
+ * GUC report variable state */
+ GucBoolState in_hot_standby; /* in_hot_standby GUC report variable
+ * state */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 9e31a53..15d0273 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 49;
# Initialize primary node
my $node_primary = get_new_node('primary');
@@ -85,7 +85,7 @@ sub test_target_session_attrs
my $node2_port = $node2->port;
my $node2_name = $node2->name;
- my $target_name = $target_node->name;
+ my $target_name = defined($target_node) ? $target_node->name : undef;
# Build connection string for connection attempt.
my $connstr = "host=$node1_host,$node2_host ";
@@ -97,10 +97,25 @@ sub test_target_session_attrs
my ($ret, $stdout, $stderr) =
$node1->psql('postgres', 'SHOW port;',
extra_params => [ '-d', $connstr ]);
- is( $status == $ret && $stdout eq $target_node->port,
- 1,
- "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
- );
+ if ($status == 0)
+ {
+ is( $status == $ret && $stdout eq $target_node->port,
+ 1,
+ "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
+ else
+ {
+ print "status = $status\n";
+ print "ret = $ret\n";
+ print "stdout = $stdout\n";
+ print "stderr = $stderr\n";
+
+ is( $status == $ret,
+ 1,
+ "fail to connect to any nodes if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
return;
}
@@ -121,6 +136,58 @@ test_target_session_attrs($node_primary, $node_standby_1, $node_primary, "any",
test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
"any", 0);
+# Connect to primary in "primary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "primary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_primary,
+ "primary", 0);
+
+# Connect to standby1 in "read-only" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "read-only", 0);
+
+# Connect to primary in "prefer-standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, $node_primary,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "standby", 0);
+
+# Fail to connect in "read-write" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "read-write", 2);
+
+# Fail to connect in "primary" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "primary", 2);
+
+# Fail to connect in "read-only" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "read-only", 2);
+
+# Fail to connect in "standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "standby", 2);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 1d540fe..4a9cc23 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -950,6 +950,7 @@ GroupingSetsPath
GucAction
GucBoolAssignHook
GucBoolCheckHook
+GucBoolState
GucContext
GucEnumAssignHook
GucEnumCheckHook
@@ -2512,6 +2513,7 @@ TapeShare
TarMethodData
TarMethodFile
TargetEntry
+TargetServerType
TclExceptionNameMap
Tcl_DString
Tcl_FileProc
--
1.8.3.1
Thanks for the comments Greg, please find my comments inline below.
On Tue, Feb 9, 2021 at 2:27 PM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Mon, Feb 8, 2021 at 8:17 PM vignesh C <vignesh21@gmail.com> wrote:
I think what we want to do is mark default_transaction_read_only as
GUC_REPORT, instead. That will give a reliable report of what the
state of its GUC is, and you can combine it with is_hot_standby
to decide whether the session should be considered read-only.
If you don't get those two GUC values during connection, then you
can fall back on "SHOW transaction_read_only".
I have made a patch for the above with the changes suggested and
rebased it with the head code.
Attached v21 patch which has the changes for the same.
Thoughts?
Further to my other doc change feedback, I can only spot the following
minor things (otherwise the changes that you have made seem OK to me).
1) doc/src/sgml/protocol.sgml
<varname>default_transaction_read_only</varname> and
<varname>in_hot_standby</varname> were not reported by releases before
14.)
should be:
<varname>default_transaction_read_only</varname> and
<varname>in_hot_standby</varname> were not reported by releases before
14.0)
Modified.
2) doc/src/sgml/high-availability.sgml
<para>
During hot standby, the parameter <varname>in_hot_standby</varname> and
<varname>default_transaction_read_only</varname> are always true and may
not be changed.
should be:
<para>
During hot standby, the parameters <varname>in_hot_standby</varname> and
<varname>transaction_read_only</varname> are always true and may
not be changed.
[I believe that there are only checks on attempts to change
"transaction_read_only" when in hot standby, not
"default_transaction_read_only"; see check_transaction_read_only()]
Modified.
3) src/interfaces/libpq/fe-connect.c
In rejectCheckedReadOrWriteConnection() and
rejectCheckedStandbyConnection(), now that host and port info are
emitted separately and are not included in each error message string
(as parameters in a format string), I think those functions should use
appendPQExpBufferStr() instead of appendPQExpBuffer(), as it's more
efficient if there is just a single string argument.
Modified.
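To illustrate point 3, here is a minimal sketch of why a plain string append is cheaper than a formatted one when there is only a constant message. DemoBuffer, demo_append_str() and demo_append_fmt() are hypothetical stand-ins written for this note; they are not libpq's PQExpBuffer, appendPQExpBufferStr() or appendPQExpBuffer(), and the fixed-size buffer is an assumption made to keep the example short.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for libpq's PQExpBuffer. */
typedef struct
{
	char		data[256];
	size_t		len;
} DemoBuffer;

/* Plain append: one strlen and one bounded memcpy, no format parsing. */
static void
demo_append_str(DemoBuffer *buf, const char *s)
{
	size_t		avail = sizeof(buf->data) - 1 - buf->len;
	size_t		n = strlen(s);

	if (n > avail)
		n = avail;				/* truncate rather than overflow */
	memcpy(buf->data + buf->len, s, n);
	buf->len += n;
	buf->data[buf->len] = '\0';
}

/* Formatted append: vsnprintf must scan the format string for '%'. */
static void
demo_append_fmt(DemoBuffer *buf, const char *fmt,...)
{
	size_t		avail = sizeof(buf->data) - buf->len;
	va_list		args;
	int			n;

	va_start(args, fmt);
	n = vsnprintf(buf->data + buf->len, avail, fmt, args);
	va_end(args);

	if (n < 0)
	{
		buf->data[buf->len] = '\0';	/* keep the buffer terminated */
		return;
	}
	if ((size_t) n >= avail)
		n = (int) (avail - 1);	/* output was truncated */
	buf->len += (size_t) n;
}
```

Both calls produce the same bytes for a constant message such as "server is in hot standby mode\n", but the plain variant skips format parsing and also treats any '%' in the text literally.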
These comments are handled in v22 patch posted in my earlier mail.
Regards,
Vignesh
On Wed, Feb 10, 2021 at 5:09 PM vignesh C <vignesh21@gmail.com> wrote:
Modified.
These comments are handled in v22 patch posted in my earlier mail.
Thanks, just one minor thing I missed in doc/src/sgml/libpq.sgml.
+ The support of read-write transactions is determined by the
value of the
+ <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> configuration parameters, that is
+ reported by the server (if supported) upon successful connection. If
should be:
+ The support of read-write transactions is determined by the
values of the
+ <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> configuration parameters, that are
+ reported by the server (if supported) upon successful connection. If
(i.e. "value" -> "values" and "is" -> "are")
Regards,
Greg Nancarrow
Fujitsu Australia
On Fri, Feb 12, 2021 at 7:07 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
On Wed, Feb 10, 2021 at 5:09 PM vignesh C <vignesh21@gmail.com> wrote:
Modified.
These comments are handled in v22 patch posted in my earlier mail.
Thanks, just one minor thing I missed in doc/src/sgml/libpq.sgml.
+ The support of read-write transactions is determined by the value of the
+ <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> configuration parameters, that is
+ reported by the server (if supported) upon successful connection. If
should be:
+ The support of read-write transactions is determined by the values of the
+ <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> configuration parameters, that are
+ reported by the server (if supported) upon successful connection. If
(i.e. "value" -> "values" and "is" -> "are")
Thanks for the comments, this is handled in the v23 patch attached.
Thoughts?
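For anyone reading the doc wording above without the code at hand, the rule it describes can be sketched roughly as follows. GucBoolState mirrors the enum added to libpq-int.h in the patch; SessionKind and classify_session() are hypothetical names used only for this illustration and do not appear in the patch.

```c
#include <stdio.h>

/* Mirrors the patch's GucBoolState: unknown until the server reports it. */
typedef enum
{
	GUC_BOOL_UNKNOWN = 0,
	GUC_BOOL_YES,
	GUC_BOOL_NO
} GucBoolState;

/* Hypothetical result type, not part of the patch. */
typedef enum
{
	SESSION_READ_WRITE,
	SESSION_READ_ONLY,
	SESSION_NEEDS_QUERY			/* fall back to SHOW transaction_read_only */
} SessionKind;

/*
 * Decide whether the session accepts read-write transactions, using only
 * the two reported parameters.  If either one was not reported, the caller
 * has to ask the server explicitly.
 */
static SessionKind
classify_session(GucBoolState default_transaction_read_only,
				 GucBoolState in_hot_standby)
{
	if (default_transaction_read_only == GUC_BOOL_UNKNOWN ||
		in_hot_standby == GUC_BOOL_UNKNOWN)
		return SESSION_NEEDS_QUERY;

	/* Either parameter being "on" means no read-write transactions. */
	if (default_transaction_read_only == GUC_BOOL_YES ||
		in_hot_standby == GUC_BOOL_YES)
		return SESSION_READ_ONLY;

	return SESSION_READ_WRITE;
}
```

The three-state enum is what lets an older server, which reports neither parameter, be told apart from a server that reports both as "off".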
Regards,
Vignesh
Attachments:
v23-0001-Enhance-the-libpq-target_session_attrs-connectio.patchtext/x-patch; charset=US-ASCII; name=v23-0001-Enhance-the-libpq-target_session_attrs-connectio.patchDownload
From 45d5680adae88a5c6f9d81a5077601f036e487c5 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignesh21@gmail.com>
Date: Mon, 8 Feb 2021 11:23:31 +0530
Subject: [PATCH v23] Enhance the libpq "target_session_attrs" connection
parameter.
Enhance the connection parameter "target_session_attrs" to support new values:
read-only/primary/standby/prefer-standby.
Add a new "read-only" target_session_attrs option value, to support connecting
to a read-only server if available from the list of hosts (otherwise the
connection attempt fails).
Add a new "primary" target_session_attrs option value, to support connecting to
a server which is not in hot-standby mode, if available from the list of hosts
(otherwise the connection attempt fails).
Add a new "standby" target_session_attrs option value, to support connecting to
a server which is in hot-standby mode, if available from the list of hosts
(otherwise the connection attempt fails).
Add a new "prefer-standby" target_session_attrs option value, to support
connecting to a server which is in hot-standby mode, if available from the list
of hosts (otherwise connect to a server which is not in hot-standby mode).
Discussion: https://www.postgresql.org/message-id/flat/CAF3+xM+8-ztOkaV9gHiJ3wfgENTq97QcjXQt+rbFQ6F7oNzt9A@mail.gmail.com
---
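Note for reviewers: the host selection described in the commit message can be modelled with the small sketch below. It assumes every listed host is reachable and that its state is already known; DemoHost and choose_host() are hypothetical names for this note only. The real code in fe-connect.c works through the connection state machine and tracks the remembered primary in which_primary_host.

```c
#include <stdbool.h>

/* Same values as the TargetServerType enum added by the patch. */
typedef enum
{
	SERVER_TYPE_ANY = 0,
	SERVER_TYPE_READ_WRITE,
	SERVER_TYPE_READ_ONLY,
	SERVER_TYPE_PRIMARY,
	SERVER_TYPE_PREFER_STANDBY,
	SERVER_TYPE_STANDBY
} TargetServerType;

/* Hypothetical description of one host in the connection string. */
typedef struct
{
	bool		in_hot_standby;
	bool		read_only;
} DemoHost;

/*
 * Return the index of the host that would be used, or -1 if no host
 * satisfies the requested type.  For prefer-standby, the first primary seen
 * is remembered and used only when the whole list contains no standby.
 */
static int
choose_host(const DemoHost *hosts, int nhosts, TargetServerType want)
{
	int			first_primary = -1;
	int			i;

	for (i = 0; i < nhosts; i++)
	{
		const DemoHost *h = &hosts[i];

		switch (want)
		{
			case SERVER_TYPE_ANY:
				return i;
			case SERVER_TYPE_READ_WRITE:
				if (!h->read_only)
					return i;
				break;
			case SERVER_TYPE_READ_ONLY:
				if (h->read_only)
					return i;
				break;
			case SERVER_TYPE_PRIMARY:
				if (!h->in_hot_standby)
					return i;
				break;
			case SERVER_TYPE_STANDBY:
				if (h->in_hot_standby)
					return i;
				break;
			case SERVER_TYPE_PREFER_STANDBY:
				if (h->in_hot_standby)
					return i;
				if (first_primary == -1)
					first_primary = i;	/* remember it for the fallback */
				break;
		}
	}

	/* Stays -1 unless prefer-standby remembered a primary. */
	return first_primary;
}
```

With a primary,standby1 list this picks index 1 for prefer-standby and index 0 for primary; with a primary,primary list it falls back to index 0 for prefer-standby and fails for standby, which is the behaviour the new cases in 001_stream_rep.pl check.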
doc/src/sgml/high-availability.sgml | 16 +-
doc/src/sgml/libpq.sgml | 73 +++++-
doc/src/sgml/protocol.sgml | 5 +-
src/backend/utils/misc/guc.c | 3 +-
src/interfaces/libpq/fe-connect.c | 432 ++++++++++++++++++++++++++++++----
src/interfaces/libpq/fe-exec.c | 18 +-
src/interfaces/libpq/libpq-fe.h | 3 +-
src/interfaces/libpq/libpq-int.h | 52 +++-
src/test/recovery/t/001_stream_rep.pl | 79 ++++++-
src/tools/pgindent/typedefs.list | 2 +
10 files changed, 602 insertions(+), 81 deletions(-)
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index f49f5c0..2bbd52c 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1700,14 +1700,14 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
</para>
<para>
- During hot standby, the parameter <varname>transaction_read_only</varname> is always
- true and may not be changed. But as long as no attempt is made to modify
- the database, connections during hot standby will act much like any other
- database connection. If failover or switchover occurs, the database will
- switch to normal processing mode. Sessions will remain connected while the
- server changes mode. Once hot standby finishes, it will be possible to
- initiate read-write transactions (even from a session begun during
- hot standby).
+ During hot standby, the parameters <varname>in_hot_standby</varname> and
+ <varname>transaction_read_only</varname> are always true and may not be
+ changed. But as long as no attempt is made to modify the database,
+ connections during hot standby will act much like any other database
+ connection. If failover or switchover occurs, the database will switch to
+ normal processing mode. Sessions will remain connected while the server
+ changes mode. Once hot standby finishes, it will be possible to initiate
+ read-write transactions (even from a session begun during hot standby).
</para>
<para>
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index b7a8245..08a1b8e 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1836,18 +1836,66 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
<term><literal>target_session_attrs</literal></term>
<listitem>
<para>
- If this parameter is set to <literal>read-write</literal>, only a
- connection in which read-write transactions are accepted by default
- is considered acceptable. The query
- <literal>SHOW transaction_read_only</literal> will be sent upon any
- successful connection; if it returns <literal>on</literal>, the connection
- will be closed. If multiple hosts were specified in the connection
- string, any remaining servers will be tried just as if the connection
- attempt had failed. The default value of this parameter,
- <literal>any</literal>, regards all connections as acceptable.
- </para>
+ The supported options for this parameter are <literal>any</literal>,
+ <literal>read-write</literal>, <literal>read-only</literal>,
+ <literal>primary</literal>, <literal>standby</literal> and
+ <literal>prefer-standby</literal>.
+ The default value of this parameter, <literal>any</literal>, regards
+ all connections as acceptable. If multiple hosts are specified in the
+ connection string, each host is tried in the order given until a connection
+ is successful.
+ </para>
+
+ <para>
+ The support of read-write transactions is determined by the values of
+ the <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> configuration parameters, that are
+ reported by the server (if supported) upon successful connection. If
+ the value of either of these parameters is <literal>on</literal>, it
+ means the server doesn't support read-write transactions. If
+ either/both of these parameters are not reported, then the support of
+ read-write transactions is determined by an explicit query, by sending
+ <literal>SHOW transaction_read_only</literal> after successful
+ connection; if it returns <literal>on</literal>, it means the server
+ doesn't support read-write transactions. The standby mode state is
+ determined by either the value of the <varname>in_hot_standby</varname>
+ configuration parameter, that is reported by the server (if supported)
+ upon successful connection, or is otherwise explicitly queried by
+ sending <literal>SELECT pg_is_in_recovery()</literal> after successful
+ connection; if it returns <literal>t</literal>, it means the server is
+ in hot standby mode.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-write</literal>, only a connection in
+ which read-write transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>read-only</literal>, only a connection in
+ which read-only transactions are accepted by default is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>primary</literal>, then only a connection in
+ which the server is not in hot standby mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>standby</literal>, then only a connection in
+ which the server is in hot standby mode is considered acceptable.
+ </para>
+
+ <para>
+ If this parameter is set to <literal>prefer-standby</literal>, then a connection
+ in which the server is in hot standby mode is preferred. Otherwise, if no such
+ connections can be found, then a connection in which the server is not in hot
+ standby mode will be considered.
+ </para>
+
</listitem>
- </varlistentry>
+ </varlistentry>
+
</variablelist>
</para>
</sect2>
@@ -2150,6 +2198,7 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>server_encoding</varname>,
<varname>client_encoding</varname>,
<varname>application_name</varname>,
+ <varname>default_transaction_read_only</varname>,
<varname>in_hot_standby</varname>,
<varname>is_superuser</varname>,
<varname>session_authorization</varname>,
@@ -2165,6 +2214,7 @@ const char *PQparameterStatus(const PGconn *conn, const char *paramName);
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before
9.0;
+ <varname>default_transaction_read_only</varname> and
<varname>in_hot_standby</varname> was not reported by releases before
14.)
Note that
@@ -7268,6 +7318,7 @@ myEventProc(PGEventId evtId, void *evtInfo, void *passThrough)
linkend="libpq-connect-target-session-attrs"/> connection parameter.
</para>
</listitem>
+
</itemizedlist>
</para>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 3763b4b..57d1f2b 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -1278,6 +1278,7 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>server_encoding</varname>,
<varname>client_encoding</varname>,
<varname>application_name</varname>,
+ <varname>default_transaction_read_only</varname>,
<varname>in_hot_standby</varname>,
<varname>is_superuser</varname>,
<varname>session_authorization</varname>,
@@ -1293,8 +1294,8 @@ SELCT 1/0;<!-- this typo is intentional -->
<varname>IntervalStyle</varname> was not reported by releases before 8.4;
<varname>application_name</varname> was not reported by releases before
9.0;
- <varname>in_hot_standby</varname> was not reported by releases before
- 14.)
+ <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> were not reported by releases before 14.0)
Note that
<varname>server_version</varname>,
<varname>server_encoding</varname> and
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index eafdb11..3dba213 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1619,7 +1619,8 @@ static struct config_bool ConfigureNamesBool[] =
{
{"default_transaction_read_only", PGC_USERSET, CLIENT_CONN_STATEMENT,
gettext_noop("Sets the default read-only status of new transactions."),
- NULL
+ NULL,
+ GUC_REPORT
},
&DefaultXactReadOnly,
false,
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 8ca0583..e389a6d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -352,7 +352,7 @@ static const internalPQconninfoOption PQconninfoOptions[] = {
{"target_session_attrs", "PGTARGETSESSIONATTRS",
DefaultTargetSessionAttrs, NULL,
- "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+ "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */
offsetof(struct pg_conn, target_session_attrs)},
/* Terminating entry --- MUST BE LAST */
@@ -1006,6 +1006,33 @@ parse_comma_separated_list(char **startptr, bool *more)
}
/*
+ * validateAndGetTargetServerType
+ *
+ * Validate a given target_session_attrs value and get the requested server type.
+ *
+ * Returns true if OK, false if the specified option value is invalid.
+ */
+static bool
+validateAndGetTargetServerType(const char *optionValue, TargetServerType *requestedServerType)
+{
+ if (strcmp(optionValue, "any") == 0)
+ *requestedServerType = SERVER_TYPE_ANY;
+ else if (strcmp(optionValue, "primary") == 0)
+ *requestedServerType = SERVER_TYPE_PRIMARY;
+ else if (strcmp(optionValue, "read-write") == 0)
+ *requestedServerType = SERVER_TYPE_READ_WRITE;
+ else if (strcmp(optionValue, "read-only") == 0)
+ *requestedServerType = SERVER_TYPE_READ_ONLY;
+ else if (strcmp(optionValue, "prefer-standby") == 0)
+ *requestedServerType = SERVER_TYPE_PREFER_STANDBY;
+ else if (strcmp(optionValue, "standby") == 0)
+ *requestedServerType = SERVER_TYPE_STANDBY;
+ else
+ return false;
+ return true;
+}
+
+/*
* connectOptions2
*
* Compute derived connection options after absorbing all user-supplied info.
@@ -1401,13 +1428,12 @@ connectOptions2(PGconn *conn)
*/
if (conn->target_session_attrs)
{
- if (strcmp(conn->target_session_attrs, "any") != 0
- && strcmp(conn->target_session_attrs, "read-write") != 0)
+ if (!validateAndGetTargetServerType(conn->target_session_attrs, &conn->requested_server_type))
{
conn->status = CONNECTION_BAD;
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid %s value: \"%s\"\n"),
- "target_settion_attrs",
+ "target_session_attrs",
conn->target_session_attrs);
return false;
}
@@ -2192,6 +2218,68 @@ connectDBComplete(PGconn *conn)
}
}
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (read-write/read-only).
+ * The connection state is set to try the next host (if any).
+ */
+static void
+rejectCheckedReadOrWriteConnection(PGconn *conn)
+{
+ if (conn->requested_server_type == SERVER_TYPE_READ_WRITE)
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("could not make a writable "
+ "connection to server\n"));
+ else
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("could not make a readonly "
+ "connection to server\n"));
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
+/*
+ * Internal helper function used for rejecting (and closing) a connection that
+ * doesn't satisfy the requested server type (for standby). The connection state
+ * is set to try the next host (if any).
+ * In the case of SERVER_TYPE_PREFER_STANDBY, if the primary host-index hasn't
+ * been set, then it is set to the index of this connection's host, so that a
+ * connection to this host can be made again in the event that no connection to
+ * a standby host could be made after the first host scan.
+ */
+static void
+rejectCheckedStandbyConnection(PGconn *conn)
+{
+ if (conn->requested_server_type == SERVER_TYPE_PRIMARY)
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("server is in hot standby mode\n"));
+ else
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("server is not in hot standby mode\n"));
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record primary host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY && conn->which_primary_host == -1)
+ conn->which_primary_host = conn->whichhost;
+
+ /*
+ * Try next host if any, but we don't want to consider additional
+ * addresses for this host.
+ */
+ conn->try_next_host = true;
+}
+
/* ----------------
* PQconnectPoll
*
@@ -2273,6 +2361,7 @@ PQconnectPoll(PGconn *conn)
case CONNECTION_CHECK_WRITABLE:
case CONNECTION_CONSUME:
case CONNECTION_GSS_STARTUP:
+ case CONNECTION_CHECK_STANDBY:
break;
default:
@@ -2309,13 +2398,33 @@ keep_going: /* We will come back to here until there is
if (conn->whichhost + 1 >= conn->nconnhost)
{
- /*
- * Oops, no more hosts. An appropriate error message is already
- * set up, so just set the right status.
- */
- goto error_return;
+ if (conn->which_primary_host >= 0)
+ {
+ /*
+ * Getting here means we failed to connect to standby servers
+ * and should now try to re-connect to a
+ * previously-connected-to primary server, whose host index is
+ * recorded in which_primary_host.
+ */
+ conn->whichhost = conn->which_primary_host;
+
+ /*
+ * Reset the host index value to avoid recursion during the
+ * second connection attempt.
+ */
+ conn->which_primary_host = -2;
+ }
+ else
+ {
+ /*
+ * Oops, no more hosts. An appropriate error message is
+ * already set up, so just set the right status.
+ */
+ goto error_return;
+ }
}
- conn->whichhost++;
+ else
+ conn->whichhost++;
/* Drop any address info for previous host */
release_conn_addrinfo(conn);
@@ -3545,31 +3654,175 @@ keep_going: /* We will come back to here until there is
case CONNECTION_CHECK_TARGET:
{
- /*
- * If a read-write connection is required, see if we have one.
- *
- * Servers before 7.4 lack the transaction_read_only GUC, but
- * by the same token they don't have any read-only mode, so we
- * may just skip the test in that case.
- */
- if (conn->sversion >= 70400 &&
- conn->target_session_attrs != NULL &&
- strcmp(conn->target_session_attrs, "read-write") == 0)
+ if (conn->requested_server_type != SERVER_TYPE_ANY)
{
/*
- * We use PQsendQueryContinue so that conn->errorMessage
- * does not get cleared. We need to preserve any error
- * messages related to previous hosts we have tried and
- * failed to connect to.
+ * If a read-write or read-only connection is required,
+ * see if we have one.
+ *
+ * Servers before 7.4 lack the
+ * default_transaction_read_only & in_hot_standby GUC, but
+ * by the same token they don't have any read-only mode,
+ * so we may just skip the test in that case.
*/
- conn->status = CONNECTION_OK;
- if (!PQsendQueryContinue(conn,
- "SHOW transaction_read_only"))
- goto error_return;
- conn->status = CONNECTION_CHECK_WRITABLE;
- return PGRES_POLLING_READING;
+
+ if (conn->sversion >= 70400 &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY))
+ {
+ /*
+ * For servers which don't have
+ * "default_transaction_read_only" or "in_hot_standby"
+ * as a GUC_REPORT variable, it is necessary to
+ * determine if they are read-only by sending the
+ * query "SHOW transaction_read_only".
+ */
+ if (conn->default_transaction_read_only == GUC_BOOL_UNKNOWN ||
+ conn->in_hot_standby == GUC_BOOL_UNKNOWN)
+ {
+ /*
+ * We use PQsendQueryContinue so that
+ * conn->errorMessage does not get cleared. We
+ * need to preserve any error messages related to
+ * previous hosts we have tried and failed to
+ * connect to.
+ */
+ conn->status = CONNECTION_OK;
+ if (!PQsendQueryContinue(conn, "SHOW transaction_read_only"))
+ goto error_return;
+
+ conn->status = CONNECTION_CHECK_WRITABLE;
+ return PGRES_POLLING_READING;
+ }
+ else if (((conn->default_transaction_read_only == GUC_BOOL_YES ||
+ conn->in_hot_standby == GUC_BOOL_YES) &&
+ conn->requested_server_type == SERVER_TYPE_READ_WRITE) ||
+ (conn->default_transaction_read_only == GUC_BOOL_NO &&
+ conn->in_hot_standby == GUC_BOOL_NO &&
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY))
+ {
+ /*
+ * Server is read-only but requested read-write,
+ * or server is read-write but requested
+ * read-only, reject and continue to process any
+ * further hosts ...
+ */
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * Servers before 9.0 don't support standby mode, skip the
+ * check when the requested type of connection is primary,
+ * prefer-standby or standby.
+ */
+ else if ((conn->sversion >= 90000 &&
+ (conn->requested_server_type == SERVER_TYPE_PRIMARY ||
+ conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * For servers which don't have the "in_hot_standby"
+ * GUC_REPORT variable, it is necessary to determine
+ * if they are in hot standby mode by sending the
+ * query "SELECT pg_is_in_recovery()".
+ */
+ if (conn->in_hot_standby == GUC_BOOL_UNKNOWN)
+ {
+ /*
+ * We use PQsendQueryContinue so that
+ * conn->errorMessage does not get cleared. We
+ * need to preserve any error messages related to
+ * previous hosts we have tried and failed to
+ * connect to.
+ */
+ conn->status = CONNECTION_OK;
+ if (!PQsendQueryContinue(conn, "SELECT pg_is_in_recovery()"))
+ goto error_return;
+
+ conn->status = CONNECTION_CHECK_STANDBY;
+ return PGRES_POLLING_READING;
+ }
+ else if ((conn->in_hot_standby == GUC_BOOL_YES &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (conn->in_hot_standby == GUC_BOOL_NO &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
+ /*
+ * Server is in standby but requested primary, or
+ * server is not in standby but requested
+ * prefer-standby/standby, reject and continue to
+ * process any further hosts ...
+ */
+ if (conn->which_primary_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of
+ * the list of connections, as it couldn't
+ * find any servers that are in standby mode.
+ */
+ goto consume_checked_target_connection;
+ }
+
+ rejectCheckedStandbyConnection(conn);
+ goto keep_going;
+ }
+
+ /* obtained the requested type, consume it */
+ goto consume_checked_target_connection;
+ }
+
+ /*
+ * If the requested type is prefer-standby, then record
+ * this host index and try any others before considering
+ * it later. If the requested type of connection is
+ * read-only or standby, ignore this connection, as
+ * servers of this version don't support this type of
+ * connection.
+ */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_READ_ONLY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)
+ {
+ if (conn->which_primary_host == -2)
+ {
+ /*
+ * This scenario is possible only for the
+ * prefer-standby type for the next pass of the
+ * list of connections, as it couldn't find any
+ * servers that are in standby mode.
+ */
+ goto consume_checked_target_connection;
+ }
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Record host index */
+ if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY)
+ {
+ if (conn->which_primary_host == -1)
+ conn->which_primary_host = conn->whichhost;
+ }
+
+ /*
+ * Try the next host, if any, but we don't want to
+ * consider additional addresses for this host.
+ */
+ conn->try_next_host = true;
+ goto keep_going;
+ }
}
+ consume_checked_target_connection:
+
/* We can release the address list now. */
release_conn_addrinfo(conn);
@@ -3641,6 +3894,7 @@ keep_going: /* We will come back to here until there is
conn->status = CONNECTION_OK;
return PGRES_POLLING_OK;
}
+
case CONNECTION_CHECK_WRITABLE:
{
conn->status = CONNECTION_OK;
@@ -3658,30 +3912,112 @@ keep_going: /* We will come back to here until there is
PQntuples(res) == 1)
{
char *val;
+ bool readonly_server;
val = PQgetvalue(res, 0, 0);
- if (strncmp(val, "on", 2) == 0)
+ readonly_server = (strncmp(val, "on", 2) == 0);
+
+ /*
+ * Server is read-only and requested server type is
+ * read-write, ignore this connection. Server is
+ * read-write and requested type is read-only, ignore this
+ * connection.
+ */
+ if ((readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_WRITE)) ||
+ (!readonly_server &&
+ (conn->requested_server_type == SERVER_TYPE_READ_ONLY)))
{
- /* Not writable; fail this connection. */
+ /* Not a requested type; fail this connection. */
PQclear(res);
+ rejectCheckedReadOrWriteConnection(conn);
+ goto keep_going;
+ }
+ /* Session is requested type, so we're good. */
+ PQclear(res);
- /* Append error report to conn->errorMessage. */
- appendPQExpBufferStr(&conn->errorMessage,
- libpq_gettext("session is read-only\n"));
+ /*
+ * Finish reading any remaining messages before being
+ * considered as ready.
+ */
+ conn->status = CONNECTION_CONSUME;
+ goto keep_going;
+ }
- /* Close connection politely. */
- conn->status = CONNECTION_OK;
- sendTerminateConn(conn);
+ /*
+ * Something went wrong with "SHOW transaction_read_only". We
+ * should try next addresses.
+ */
+ if (res)
+ PQclear(res);
+ /* Append error report to conn->errorMessage. */
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("test \"SHOW transaction_read_only\" failed\n"));
+
+ /* Close connection politely. */
+ conn->status = CONNECTION_OK;
+ sendTerminateConn(conn);
+
+ /* Try next address */
+ conn->try_next_addr = true;
+ goto keep_going;
+ }
+
+ case CONNECTION_CHECK_STANDBY:
+ {
+ conn->status = CONNECTION_OK;
+ if (!PQconsumeInput(conn))
+ {
+ goto error_return;
+ }
+
+ if (PQisBusy(conn))
+ {
+ conn->status = CONNECTION_CHECK_STANDBY;
+ return PGRES_POLLING_READING;
+ }
+
+ res = PQgetResult(conn);
+ if (res && (PQresultStatus(res) == PGRES_TUPLES_OK) &&
+ PQntuples(res) == 1)
+ {
+ char *val;
+ bool standby_server;
+
+ val = PQgetvalue(res, 0, 0);
+ standby_server = (strncmp(val, "t", 1) == 0);
+
+ /*
+ * Server is in standby mode and requested mode is
+ * primary, ignore it. Server is not in standby mode and
+ * requested mode is prefer-standby, record it for the
+ * first time and try to consume in the next scan (it
+ * means no standby server was found in the first scan).
+ */
+ if ((standby_server &&
+ conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+ (!standby_server &&
+ (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+ conn->requested_server_type == SERVER_TYPE_STANDBY)))
+ {
/*
- * Try next host if any, but we don't want to consider
- * additional addresses for this host.
+ * The following scenario is possible only for the
+ * prefer-standby mode for the next pass of the list
+ * of connections, as it couldn't find any servers
+ * that are in standby mode.
*/
- conn->try_next_host = true;
+ if (conn->which_primary_host == -2)
+ goto consume_checked_standby_connection;
+
+ /* Not a requested type; fail this connection. */
+ PQclear(res);
+
+ rejectCheckedStandbyConnection(conn);
goto keep_going;
}
-
- /* Session is read-write, so we're good. */
+ consume_checked_standby_connection:
+ /* Session is requested type, so we're good. */
PQclear(res);
/*
@@ -3693,7 +4029,7 @@ keep_going: /* We will come back to here until there is
}
/*
- * Something went wrong with "SHOW transaction_read_only". We
+ * Something went wrong with "SELECT pg_is_in_recovery()". We
* should try next addresses.
*/
if (res)
@@ -3701,7 +4037,7 @@ keep_going: /* We will come back to here until there is
/* Append error report to conn->errorMessage. */
appendPQExpBufferStr(&conn->errorMessage,
- libpq_gettext("test \"SHOW transaction_read_only\" failed\n"));
+ libpq_gettext("test \"SELECT pg_is_in_recovery()\" failed\n"));
/* Close connection politely. */
conn->status = CONNECTION_OK;
@@ -3711,7 +4047,6 @@ keep_going: /* We will come back to here until there is
conn->try_next_addr = true;
goto keep_going;
}
-
default:
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("invalid connection state %d, "
@@ -3855,10 +4190,15 @@ makeEmptyPGconn(void)
conn->setenv_state = SETENV_STATE_IDLE;
conn->client_encoding = PG_SQL_ASCII;
conn->std_strings = false; /* unless server says differently */
+ conn->default_transaction_read_only = GUC_BOOL_UNKNOWN;
+ conn->in_hot_standby = GUC_BOOL_UNKNOWN;
conn->verbosity = PQERRORS_DEFAULT;
conn->show_context = PQSHOW_CONTEXT_ERRORS;
conn->sock = PGINVALID_SOCKET;
+ conn->requested_server_type = SERVER_TYPE_ANY;
+ conn->which_primary_host = -1;
+
/*
* We try to send at least 8K at a time, which is the usual size of pipe
* buffers on Unix systems. That way, when we are sending a large amount
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index e730753..60cd4a7 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1008,11 +1008,11 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
}
/*
- * Special hacks: remember client_encoding and
- * standard_conforming_strings, and convert server version to a numeric
- * form. We keep the first two of these in static variables as well, so
- * that PQescapeString and PQescapeBytea can behave somewhat sanely (at
- * least in single-connection-using programs).
+ * Special hacks: remember client_encoding, default_transaction_read_only,
+ * in_hot_standby and standard_conforming_strings, and convert server
+ * version to a numeric form. We keep the first two of these in static
+ * variables as well, so that PQescapeString and PQescapeBytea can
+ * behave somewhat sanely (at least in single-connection-using programs).
*/
if (strcmp(name, "client_encoding") == 0)
{
@@ -1062,6 +1062,14 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
else
conn->sversion = 0; /* unknown */
}
+ else if (strcmp(name, "default_transaction_read_only") == 0)
+ {
+ conn->default_transaction_read_only = (strcmp(value, "on") == 0 ? GUC_BOOL_YES : GUC_BOOL_NO);
+ }
+ else if (strcmp(name, "in_hot_standby") == 0)
+ {
+ conn->in_hot_standby = (strcmp(value, "on") == 0 ? GUC_BOOL_YES : GUC_BOOL_NO);
+ }
}
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index effe0cc..330b948 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -68,7 +68,8 @@ typedef enum
CONNECTION_CONSUME, /* Wait for any pending message and consume
* them. */
CONNECTION_GSS_STARTUP, /* Negotiating GSSAPI. */
- CONNECTION_CHECK_TARGET /* Check if we have a proper target connection */
+ CONNECTION_CHECK_TARGET, /* Check if we have a proper target connection */
+ CONNECTION_CHECK_STANDBY /* Check whether server is in standby mode */
} ConnStatusType;
typedef enum
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4db4983..0ba81ae 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -317,6 +317,29 @@ typedef struct pg_conn_host
* found in password file. */
} pg_conn_host;
+/* Target server type to connect to */
+typedef enum
+{
+ SERVER_TYPE_ANY = 0, /* Any server (default) */
+ SERVER_TYPE_READ_WRITE, /* Read-write server */
+ SERVER_TYPE_READ_ONLY, /* Read-only server */
+ SERVER_TYPE_PRIMARY, /* Primary server */
+ SERVER_TYPE_PREFER_STANDBY, /* Prefer Standby server */
+ SERVER_TYPE_STANDBY /* Standby server */
+} TargetServerType;
+
+/*
+ * State of certain bool GUCs used by libpq, which are determined
+ * either by the GUC_REPORT mechanism (where supported by the server
+ * version) or by lazy evaluation (using a query sent to the server).
+ */
+typedef enum
+{
+ GUC_BOOL_UNKNOWN = 0, /* Currently unknown */
+ GUC_BOOL_YES, /* Yes (true) */
+ GUC_BOOL_NO /* No (false) */
+} GucBoolState;
+
/*
* PGconn stores all the state data associated with a single connection
* to a backend.
@@ -370,9 +393,17 @@ struct pg_conn
char *ssl_min_protocol_version; /* minimum TLS protocol version */
char *ssl_max_protocol_version; /* maximum TLS protocol version */
- /* Type of connection to make. Possible values: any, read-write. */
+ /*
+ * Type of connection to make. Possible values: "any", "read-write",
+ * "read-only", "primary", "prefer-standby", "standby".
+ */
char *target_session_attrs;
+ /*
+ * The requested server type, derived from target_session_attrs.
+ */
+ TargetServerType requested_server_type;
+
/* Optional file to write trace info to */
FILE *Pfdebug;
@@ -406,6 +437,21 @@ struct pg_conn
pg_conn_host *connhost; /* details about each named host */
char *connip; /* IP address for current network connection */
+ /*
+ * Index of the first primary host encountered (if any) in the connection
+ * string. This is used during processing of requested server connection
+ * type SERVER_TYPE_PREFER_STANDBY.
+ *
+ * The initial value is -1, indicating that no primary host has yet been
+ * found. It is then set to the index of the first primary host, if one is
+ * found in the connection string during processing. If a second
+ * connection attempt is later made to that primary host (because no
+ * connection to a standby server could be made), which_primary_host is
+ * then set to -2 to avoid recursion during subsequent processing (and
+ * whichhost is set to the primary host index).
+ */
+ int which_primary_host;
+
/* Connection data */
pgsocket sock; /* FD for socket, PGINVALID_SOCKET if
* unconnected */
@@ -436,6 +482,10 @@ struct pg_conn
pgParameterStatus *pstatus; /* ParameterStatus data */
int client_encoding; /* encoding id */
bool std_strings; /* standard_conforming_strings */
+ GucBoolState default_transaction_read_only; /* default_transaction_read_only
+ * GUC report variable state */
+ GucBoolState in_hot_standby; /* in_hot_standby GUC report variable
+ * state */
PGVerbosity verbosity; /* error/notice message verbosity */
PGContextVisibility show_context; /* whether to show CONTEXT field */
PGlobjfuncs *lobjfuncs; /* private state for large-object access fns */
diff --git a/src/test/recovery/t/001_stream_rep.pl b/src/test/recovery/t/001_stream_rep.pl
index 9e31a53..15d0273 100644
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -3,7 +3,7 @@ use strict;
use warnings;
use PostgresNode;
use TestLib;
-use Test::More tests => 36;
+use Test::More tests => 49;
# Initialize primary node
my $node_primary = get_new_node('primary');
@@ -85,7 +85,7 @@ sub test_target_session_attrs
my $node2_port = $node2->port;
my $node2_name = $node2->name;
- my $target_name = $target_node->name;
+ my $target_name = $target_node->name if (defined $target_node);
# Build connection string for connection attempt.
my $connstr = "host=$node1_host,$node2_host ";
@@ -97,10 +97,25 @@ sub test_target_session_attrs
my ($ret, $stdout, $stderr) =
$node1->psql('postgres', 'SHOW port;',
extra_params => [ '-d', $connstr ]);
- is( $status == $ret && $stdout eq $target_node->port,
- 1,
- "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
- );
+ if ($status == 0)
+ {
+ is( $status == $ret && $stdout eq $target_node->port,
+ 1,
+ "connect to node $target_name if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
+ else
+ {
+ print "status = $status\n";
+ print "ret = $ret\n";
+ print "stdout = $stdout\n";
+ print "stderr = $stderr\n";
+
+ is( $status == $ret,
+ 1,
+ "fail to connect to any nodes if mode \"$mode\" and $node1_name,$node2_name listed"
+ );
+ }
return;
}
@@ -121,6 +136,58 @@ test_target_session_attrs($node_primary, $node_standby_1, $node_primary, "any",
test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
"any", 0);
+# Connect to primary in "primary" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_primary,
+ "primary", 0);
+
+# Connect to primary in "primary" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_primary,
+ "primary", 0);
+
+# Connect to standby1 in "read-only" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "read-only", 0);
+
+# Connect to standby1 in "read-only" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "read-only", 0);
+
+# Connect to primary in "prefer-standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, $node_primary,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "prefer-standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "prefer-standby", 0);
+
+# Connect to standby1 in "standby" mode with primary,standby1 list.
+test_target_session_attrs($node_primary, $node_standby_1, $node_standby_1,
+ "standby", 0);
+
+# Connect to standby1 in "standby" mode with standby1,primary list.
+test_target_session_attrs($node_standby_1, $node_primary, $node_standby_1,
+ "standby", 0);
+
+# Fail to connect in "read-write" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "read-write", 2);
+
+# Fail to connect in "primary" mode with standby1,standby2 list.
+test_target_session_attrs($node_standby_1, $node_standby_2, undef,
+ "primary", 2);
+
+# Fail to connect in "read-only" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "read-only", 2);
+
+# Fail to connect in "standby" mode with primary,primary list.
+test_target_session_attrs($node_primary, $node_primary, undef,
+ "standby", 2);
+
# Test for SHOW commands using a WAL sender connection with a replication
# role.
note "testing SHOW commands for replication connection";
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index bab4f3a..5b9e632 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -950,6 +950,7 @@ GroupingSetsPath
GucAction
GucBoolAssignHook
GucBoolCheckHook
+GucBoolState
GucContext
GucEnumAssignHook
GucEnumCheckHook
@@ -2512,6 +2513,7 @@ TapeShare
TarMethodData
TarMethodFile
TargetEntry
+TargetServerType
TclExceptionNameMap
Tcl_DString
Tcl_FileProc
--
1.8.3.1
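For readers skimming the patch, the accept/reject rule it applies to the standby-related target_session_attrs values can be summarised in a small standalone sketch. This is not libpq code: WantedType, TriState, parse_reported_bool and host_is_acceptable are hypothetical names that only mirror the patch's TargetServerType and GucBoolState. The sketch assumes the "unknown" state has already been resolved (the patch does that by sending "SELECT pg_is_in_recovery()"), and it covers only the first pass over the host list, not the prefer-standby fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's TargetServerType and GucBoolState. */
typedef enum
{
	WANT_ANY,
	WANT_PRIMARY,
	WANT_STANDBY,
	WANT_PREFER_STANDBY
} WantedType;

typedef enum
{
	STATE_UNKNOWN,
	STATE_YES,
	STATE_NO
} TriState;

/*
 * Map a reported boolean setting ("on"/"off") to a tri-state.
 * NULL means the server never reported the setting.
 */
TriState
parse_reported_bool(const char *value)
{
	if (value == NULL)
		return STATE_UNKNOWN;
	return strcmp(value, "on") == 0 ? STATE_YES : STATE_NO;
}

/*
 * Decide whether a host satisfies the requested type on the first pass.
 * The caller must resolve STATE_UNKNOWN (for example by querying the
 * server) before calling this.
 */
bool
host_is_acceptable(WantedType wanted, TriState in_hot_standby)
{
	assert(in_hot_standby != STATE_UNKNOWN);

	switch (wanted)
	{
		case WANT_ANY:
			return true;
		case WANT_PRIMARY:
			return in_hot_standby == STATE_NO;
		case WANT_STANDBY:
		case WANT_PREFER_STANDBY:
			return in_hot_standby == STATE_YES;
	}
	return false;				/* not reached for valid enum values */
}
```

Keeping "unknown" as a separate state is what lets the client tell "the server said no" apart from "the server is too old to report this", which is the distinction the patch relies on when deciding whether to send a follow-up query.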
On Fri, Feb 12, 2021 at 2:42 PM vignesh C <vignesh21@gmail.com> wrote:
Thanks, just one minor thing I missed in doc/src/sgml/libpq.sgml.
+ The support of read-write transactions is determined by the value of the
+ <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> configuration parameters, that is
+ reported by the server (if supported) upon successful connection. If
should be:
+ The support of read-write transactions is determined by the values of the
+ <varname>default_transaction_read_only</varname> and
+ <varname>in_hot_standby</varname> configuration parameters, that are
+ reported by the server (if supported) upon successful connection. If
(i.e. "value" -> "values" and "is" -> "are")
Thanks for the comments, this is handled in the v23 patch attached.
Thoughts?
I've marked this as "Ready for Committer".
(and also added you to the author list)
Regards,
Greg Nancarrow
Fujitsu Australia
Greg Nancarrow <gregn4422@gmail.com> writes:
I've marked this as "Ready for Committer".
I've pushed this after whacking it around a fair amount. A lot of
that was cosmetic, but one thing that wasn't is that I got rid of the
proposed "which_primary_host" variable. I thought the logic around
that was way too messy and probably buggy. Even if it worked exactly
as intended, I'm dubious that the design intention was good. I think
it makes more sense just to go through the whole server list again
without the restriction to standby servers. In particular, that will
give saner results if the servers' status isn't holding still.
regards, tom lane
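The approach described above (scan the list once looking for standbys, then go through the whole list again without that restriction) might look roughly like the following. This is a minimal sketch only, not the committed libpq code: HostStatus, ProbeFn, pick_prefer_standby and probe_from_array are names invented for illustration, and the probe callback stands in for actually connecting to a host and checking its state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result of probing one host. */
typedef enum
{
	HOST_DOWN,
	HOST_PRIMARY,
	HOST_STANDBY
} HostStatus;

/* Hypothetical probe callback: connect to host_index and report its state. */
typedef HostStatus (*ProbeFn) (int host_index, void *ctx);

/*
 * Return the index of the host to use under "prefer-standby" semantics,
 * or -1 if no host is reachable.
 */
int
pick_prefer_standby(int nhosts, ProbeFn probe, void *ctx)
{
	assert(probe != NULL);

	/* Pass 1: accept only hosts that are currently standbys. */
	for (int i = 0; i < nhosts; i++)
	{
		if (probe(i, ctx) == HOST_STANDBY)
			return i;
	}

	/*
	 * Pass 2: probe every host again and accept any reachable one.
	 * Nothing is remembered from pass 1, so a host whose status changed
	 * in the meantime is judged by its current state.
	 */
	for (int i = 0; i < nhosts; i++)
	{
		if (probe(i, ctx) != HOST_DOWN)
			return i;
	}

	return -1;
}

/* Array-backed probe used only to exercise the sketch. */
HostStatus
probe_from_array(int host_index, void *ctx)
{
	const HostStatus *states = ctx;

	return states[host_index];
}
```

Because the second pass re-probes rather than jumping back to a remembered index, there is no per-connection bookkeeping comparable to which_primary_host to keep consistent.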
On Wed, Mar 3, 2021 at 7:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Greg Nancarrow <gregn4422@gmail.com> writes:
I've marked this as "Ready for Committer".
I've pushed this after whacking it around a fair amount. A lot of
that was cosmetic, but one thing that wasn't is that I got rid of the
proposed "which_primary_host" variable. I thought the logic around
that was way too messy and probably buggy. Even if it worked exactly
as intended, I'm dubious that the design intention was good. I think
it makes more sense just to go through the whole server list again
without the restriction to standby servers. In particular, that will
give saner results if the servers' status isn't holding still.
Buildfarm machine crake and conchuela have failed after this commit.
I had checked the failures, crake is failing because of:
Mar 02 21:22:56 ./src/test/recovery/t/001_stream_rep.pl: Variable declared
in conditional statement at line 88, column 2. Declare variables outside
of the condition. ([Variables::ProhibitConditionalDeclarations] Severity:
5)
I have analyzed and posted a patch at [1] for this. That might fix this
problem.
Conchuela is failing because of:
ok 17 - connect to node standby_1 if mode "standby" and standby_1,primary
listed
ack Broken pipe: write( 13, 'SHOW port;' ) at
/usr/local/lib/perl5/site_perl/IPC/Run/IO.pm line 549.
### Stopping node "primary" using mode immediate
# Running: pg_ctl -D
/home/pgbf/buildroot/HEAD/pgsql.build/src/test/recovery/tmp_check/t_001_stream_rep_primary_data/pgdata
-m immediate stop
waiting for server to shut down... done
I could not find the exact reason for this failure; I'm still checking
why it is failing.
Thoughts?
[1]: /messages/by-id/CALDaNm3L=ROeb=4rKf0XMN0CqrEnn6T=-44m4fsDAhcw-OUCVA@mail.gmail.com
Regards,
Vignesh
On Wed, Mar 3, 2021 at 2:49 PM vignesh C <vignesh21@gmail.com> wrote:
On Wed, Mar 3, 2021 at 7:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Greg Nancarrow <gregn4422@gmail.com> writes:
I've marked this as "Ready for Committer".
I've pushed this after whacking it around a fair amount. A lot of
that was cosmetic, but one thing that wasn't is that I got rid of the
proposed "which_primary_host" variable. I thought the logic around
that was way too messy and probably buggy. Even if it worked exactly
as intended, I'm dubious that the design intention was good. I think
it makes more sense just to go through the whole server list again
without the restriction to standby servers. In particular, that will
give saner results if the servers' status isn't holding still.
Buildfarm machine crake and conchuela have failed after this commit.
I had checked the failures, crake is failing because of:
Mar 02 21:22:56 ./src/test/recovery/t/001_stream_rep.pl: Variable
declared in conditional statement at line 88, column 2. Declare variables
outside of the condition. ([Variables::ProhibitConditionalDeclarations]
Severity: 5)
I have analyzed and posted a patch at [1] for this. That might fix this
problem.
Conchuela is failing because of:
ok 17 - connect to node standby_1 if mode "standby" and standby_1,primary
listed
ack Broken pipe: write( 13, 'SHOW port;' ) at
/usr/local/lib/perl5/site_perl/IPC/Run/IO.pm line 549.
### Stopping node "primary" using mode immediate
# Running: pg_ctl -D
/home/pgbf/buildroot/HEAD/pgsql.build/src/test/recovery/tmp_check/t_001_stream_rep_primary_data/pgdata
-m immediate stop
waiting for server to shut down... done
I could not find the exact reason for this failure, I'm checking further
on why it is failing.
Thoughts?
[1] - /messages/by-id/CALDaNm3L=ROeb=4rKf0XMN0CqrEnn6T=-44m4fsDAhcw-OUCVA@mail.gmail.com
At least the first problem seems to possibly be because of:
The buildfarm machine crake is using Perl 5.30.3.
Regards,
Greg Nancarrow
Fujitsu Australia
vignesh C <vignesh21@gmail.com> writes:
Conchuela is failing because of:
ok 17 - connect to node standby_1 if mode "standby" and standby_1,primary
listed
ack Broken pipe: write( 13, 'SHOW port;' ) at
/usr/local/lib/perl5/site_perl/IPC/Run/IO.pm line 549.
It didn't fail on the next run, so this might just be a phase-of-the-moon
glitch. Conchuela is a bit prone to that sort of thing, in my experience.
We'll have to wait and see if it's at all repeatable.
regards, tom lane
I wrote:
vignesh C <vignesh21@gmail.com> writes:
Conchuela is failing because of:
ok 17 - connect to node standby_1 if mode "standby" and standby_1,primary
listed
ack Broken pipe: write( 13, 'SHOW port;' ) at
/usr/local/lib/perl5/site_perl/IPC/Run/IO.pm line 549.
It didn't fail on the next run, so this might just be a phase-of-the-moon
glitch. Conchuela is a bit prone to that sort of thing, in my experience.
We'll have to wait and see if it's at all repeatable.
Conchuela hasn't shown it again, but it turns out to be repeatable
on my old warhorse gaur. After a bit of study I see the problem:
we're asking Perl to write to the stdin of a psql process that
may not be there to receive the data. We've dodged that issue
in other tests by passing "undef" as the stdin to sub psql, so
that's what I did here.
regards, tom lane
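The failure mode described above, writing to the stdin of a process that is no longer there to read it, can be reproduced in isolation on a POSIX system. The following is an illustrative sketch only and has nothing to do with the TAP test code itself: write_to_closed_reader is a made-up name, and closing the read end of a pipe stands in for psql having already exited.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Write to a pipe whose read end has already been closed.
 *
 * Returns the errno reported by write(), 0 if the write unexpectedly
 * succeeded, or -1 if the pipe could not be created.
 */
int
write_to_closed_reader(void)
{
	int			fds[2];
	int			saved_errno = 0;
	void		(*old_handler) (int);

	if (pipe(fds) != 0)
		return -1;

	/*
	 * Ignore SIGPIPE so that write() fails with EPIPE instead of the
	 * signal terminating the process.
	 */
	old_handler = signal(SIGPIPE, SIG_IGN);

	/* The "reader" goes away before anything is written. */
	close(fds[0]);

	if (write(fds[1], "SHOW port;", 10) < 0)
		saved_errno = errno;

	close(fds[1]);

	/* Restore the previous SIGPIPE disposition. */
	if (old_handler != SIG_ERR)
		signal(SIGPIPE, old_handler);

	return saved_errno;
}
```

Whether the writer sees this error depends on timing: if the reader is still alive when the write happens, the write succeeds, which is consistent with the failure showing up only intermittently on some machines.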