[PATCH] pgbench: add multiconnect option
-hackers,
This patch adds the concept of "multiconnect" to pgbench (better
terminology welcome). The basic idea here is to allow connections made
with pgbench to use different auth values or connect to multiple
databases. We implement this using a user-provided PGSERVICEFILE and
choosing a PGSERVICE from this based on a number of strategies.
(Currently the only supported strategies are round robin or random.)
There is definite room for improvement here; at the very least, teaching
`pgbench -i` about all of the distinct DBs referenced in this service
file would ensure that initialization works as expected in all places.
For now, we are punting initialization to the user in this version of
the patch if using more that one database in the given service file.
Best,
David
Attachments:
pgbench-add-multiconnect.patchtext/x-patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 0c60077e1f..94616c13c2 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -161,6 +161,11 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<envar>PGDATABASE</envar> is used. If that is not set, the
user name specified for the connection is used.
</para>
+ <para>
+ If <literal>multiconnect</literal> mode is enabled, a defined
+ <literal>dbname</literal> in the chosen service will override this
+ value.
+ </para>
</listitem>
</varlistentry>
@@ -840,6 +845,39 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<variablelist>
+ <varlistentry>
+ <term><option>-m</option> <replaceable>servicefile</replaceable></term>
+ <term><option>--multiconnect-file=</option><replaceable>servicefile</replaceable></term>
+ <listitem>
+ <para>
+ Turns on <literal>multiconnect</literal> mode and uses the given
+ <literal>pg_service</literal>-style file to derive connection
+ information from. Any/all connection parameters in this file will
+ overwrite any that were provided in the command-line.
+ </para>
+ <para>
+ Since this behavior will make a connection using
+ the <envar>PGSERVICEFILE</envar> mechanism, it is possible to
+ connect to other databases than the one provided in the original
+ command invocation. This option assumes that the user has previously
+ run the necessarily initialization steps against all databases that
+ would be accessed via this service file.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-g roundrobin|random</option></term>
+ <term><option>--multiconnect-strategy=roundrobin|random</option></term>
+ <listitem>
+ <para>
+ Selects the strategy by which <literal>multiconnect</literal> mode
+ uses the connections defined in the indicated service file. The
+ default value is <literal>roundrobin</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-h</option> <replaceable>hostname</replaceable></term>
<term><option>--host=</option><replaceable>hostname</replaceable></term>
@@ -847,6 +885,11 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<para>
The database server's host name
</para>
+ <para>
+ If <literal>multiconnect</literal> mode is enabled, a defined
+ <literal>host</literal> in the chosen service will override this
+ value.
+ </para>
</listitem>
</varlistentry>
@@ -857,6 +900,11 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<para>
The database server's port number
</para>
+ <para>
+ If <literal>multiconnect</literal> mode is enabled, a defined
+ <literal>port</literal> in the chosen service will override this
+ value.
+ </para>
</listitem>
</varlistentry>
@@ -867,6 +915,11 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<para>
The user name to connect as
</para>
+ <para>
+ If <literal>multiconnect</literal> mode is enabled, a defined
+ <literal>user</literal> in the chosen service will override this
+ value.
+ </para>
</listitem>
</varlistentry>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4aeccd93af..2834c9ef3c 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -69,6 +69,7 @@
#include "pgbench.h"
#include "port/pg_bitutils.h"
#include "portability/instr_time.h"
+#include "lib/stringinfo.h"
#ifndef M_PI
#define M_PI 3.14159265358979323846
@@ -275,6 +276,8 @@ int nthreads = 1; /* number of threads */
bool is_connect; /* establish connection for each transaction */
bool report_per_command; /* report per-command latencies */
int main_pid; /* main process id used in log filename */
+int num_service_names = 0; /* how many service file names are in the indicated service file */
+int cur_service_index = 0; /* the index of the next service file; used for round-robin */
const char *pghost = NULL;
const char *pgport = NULL;
@@ -282,6 +285,7 @@ const char *username = NULL;
const char *dbName = NULL;
char *logfile_prefix = NULL;
const char *progname;
+const char **service_names = NULL;
#define WSEP '@' /* weight separator */
@@ -549,6 +553,14 @@ typedef enum QueryMode
static QueryMode querymode = QUERY_SIMPLE;
static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
+typedef enum MultiConnectStrategy
+{
+ MC_ROUND_ROBIN,
+ MC_RANDOM
+} MultiConnectStrategy;
+
+static MultiConnectStrategy multiconnect_strategy = MC_ROUND_ROBIN;
+
/*
* struct Command represents one command in a script.
*
@@ -663,7 +675,7 @@ static void clear_socket_set(socket_set *sa);
static void add_socket_to_set(socket_set *sa, int fd, int idx);
static int wait_on_socket_set(socket_set *sa, int64 usecs);
static bool socket_has_input(socket_set *sa, int fd, int idx);
-
+static const char **availableServiceEntries(const char *serviceFile);
/* callback functions for our flex lexer */
static const PsqlScanCallbacks pgbench_callbacks = {
@@ -727,6 +739,10 @@ usage(void)
" -j, --jobs=NUM number of threads (default: 1)\n"
" -l, --log write transaction times to log file\n"
" -L, --latency-limit=NUM count transactions lasting more than NUM ms as late\n"
+ " -m, --multiconnect=FILE use multiple auth defined in the given service file\n"
+ " -g, --multiconnect-strategy=roundrobin|random\n"
+ " use the given strategy for choosing the service to connect as\n"
+ " (default: roundrobin)\n"
" -M, --protocol=simple|extended|prepared\n"
" protocol for submitting queries (default: simple)\n"
" -n, --no-vacuum do not run VACUUM before tests\n"
@@ -1351,6 +1367,33 @@ doConnect(void)
bool new_pass;
static char *password = NULL;
+ /*
+ * If we are doing a round-robin of service files names, then use/choose the next name
+ */
+ if (num_service_names) {
+ const char *service;
+
+ if (multiconnect_strategy == MC_ROUND_ROBIN)
+ {
+ service = service_names[cur_service_index++];
+
+ if (cur_service_index >= num_service_names)
+ cur_service_index = 0;
+ }
+ else if (multiconnect_strategy == MC_RANDOM)
+ {
+ /*
+ * We need to get random int <= num_service_names; since this is
+ * infrequently-called and just need uniform integer distribution,
+ * we are using system random() instead of one of the more complex
+ * functions available in this file.
+ */
+ service = service_names[ ((unsigned long)random()) % num_service_names ];
+ }
+ pg_log_info("using service: %s", service);
+ setenv("PGSERVICE", service, true);
+ }
+
/*
* Start the connection. Loop until we have a password if requested by
* backend.
@@ -5724,6 +5767,74 @@ set_random_seed(const char *seed)
return true;
}
+static const char**
+availableServiceEntries(const char *serviceFile)
+{
+ int linenr = 0,
+ num_svc = 0,
+ max_svc = 10;
+ FILE *f;
+ char *line, **services;
+ StringInfoData linebuf;
+
+ f = fopen(serviceFile, "r");
+ if (f == NULL)
+ {
+ return NULL;
+ }
+
+ initStringInfo(&linebuf);
+
+ services = (char **)pg_malloc0(max_svc * sizeof(char *));
+
+ while (pg_get_line_buf(f, &linebuf))
+ {
+ linenr++;
+
+ /* ignore whitespace at end of line, especially the newline */
+ while (linebuf.len > 0 &&
+ isspace((unsigned char) linebuf.data[linebuf.len - 1]))
+ linebuf.data[--linebuf.len] = '\0';
+
+ line = linebuf.data;
+
+ /* ignore leading whitespace too */
+ while (*line && isspace((unsigned char) line[0]))
+ line++;
+
+ /* ignore comments and empty lines */
+ if (line[0] == '\0' || line[0] == '#')
+ continue;
+
+ /* Check for groupname section */
+ if (line[0] == '[')
+ {
+ char *endp;
+
+ line++;
+
+ endp = strchr(line, ']');
+ if (endp && (endp - line) > 0) {
+ /* add the literal block to the chunk */
+ services[num_svc] = pnstrdup(line, (endp - line));
+
+ /* possibly expand memory */
+ if (++num_svc >= max_svc) {
+ max_svc += 10;
+ services = pg_realloc(services, max_svc * sizeof(char *));
+ }
+
+ /* null out the next possible entry */
+ services[num_svc] = NULL;
+ }
+ }
+ }
+
+ fclose(f);
+ pfree(linebuf.data);
+ return (const char**)services;
+}
+
int
main(int argc, char **argv)
{
@@ -5742,6 +5853,8 @@ main(int argc, char **argv)
{"jobs", required_argument, NULL, 'j'},
{"log", no_argument, NULL, 'l'},
{"latency-limit", required_argument, NULL, 'L'},
+ {"multiconnect", required_argument, NULL, 'm'},
+ {"multiconnect-strategy", required_argument, NULL, 'g'},
{"no-vacuum", no_argument, NULL, 'n'},
{"port", required_argument, NULL, 'p'},
{"progress", required_argument, NULL, 'P'},
@@ -5835,7 +5948,7 @@ main(int argc, char **argv)
exit(1);
}
- while ((c = getopt_long(argc, argv, "iI:h:nvp:dqb:SNc:j:Crs:t:T:U:lf:D:F:M:P:R:L:", long_options, &optindex)) != -1)
+ while ((c = getopt_long(argc, argv, "iI:h:nvp:dqb:SNc:j:Crs:t:T:U:lf:D:F:M:m:g:P:R:L:", long_options, &optindex)) != -1)
{
char *script;
@@ -6008,6 +6121,55 @@ main(int argc, char **argv)
exit(1);
}
break;
+ case 'm':
+ {
+ char **p;
+
+ service_names = availableServiceEntries(optarg);
+ p = (char**)service_names;
+
+ if (!service_names)
+ {
+ pg_log_fatal("Couldn't find any services in file '%s'", optarg);
+ exit(1);
+ }
+
+ while (*(p++))
+ num_service_names++;
+
+ /*
+ * If we found non-zero services in our file then we can set
+ * the PGSERVICEFILE variable to point to the file we parsed,
+ * otherwise there is no point.
+ */
+
+ if (num_service_names) {
+ setenv("PGSERVICEFILE", optarg, true);
+
+ /*
+ * Warn if number of services exceeds the number of
+ * clients expected.
+ */
+
+ if (num_service_names > nclients)
+ pg_log_warning("Found %d services defined, but -c is set to %d; did you mean to increase -c?",
+ num_service_names,
+ nclients
+ );
+ }
+ }
+ break;
+ case 'g':
+ if (strcmp(optarg, "roundrobin") == 0)
+ multiconnect_strategy = MC_ROUND_ROBIN;
+ else if (strcmp(optarg, "random") == 0)
+ multiconnect_strategy = MC_RANDOM;
+ else
+ {
+ pg_log_fatal("Unrecognized multiconnect strategy: %s", optarg);
+ exit(1);
+ }
+ break;
case 'M':
benchmarking_option_set = true;
for (querymode = 0; querymode < NUM_QUERYMODE; querymode++)
Hello David,
This patch adds the concept of "multiconnect" to pgbench (better
terminology welcome).
Good. I was thinking of adding such capability, possibly for handling
connection errors and reconnecting…
The basic idea here is to allow connections made with pgbench to use
different auth values or connect to multiple databases. We implement
this using a user-provided PGSERVICEFILE and choosing a PGSERVICE from
this based on a number of strategies. (Currently the only supported
strategies are round robin or random.)
I was thinking of providing a allowing a list of conninfo strings with
repeated options, eg --conninfo "foo" --conninfo "bla"…
Your approach using PGSERVICEFILE also make sense!
Maybe it could be simplified, the code base reduced, and provide more
benefits, by mixing both ideas.
In particular, pgbench parses the file but then it will be read also by
libpq, yuk yuk.
Also, I do not like that PGSERVICE is overriden by pgbench, while other
options are passed with the parameters approach in doConnect. It would
make proce sense to add a "service" field to the parameters for
consistency, if this approach was to be pursued.
On reflexion, I'd suggest to use the --conninfo (or some other name)
approach, eg "pgbench --conninfo='service=s1' --conninfo='service=s2'" and
users just have to set PGSERVICEFILE env themselves, which I think is
better than pgbench overriding env variables behind their back.
This allow to have a service file with more connections and just tell
pgbench which ones to use, which is the expected way to use this feature.
This drops file parsing.
I can only see benefit to this simplified approach.
What do you think?
About the patch:
There are warnings about trailing whitespaces when applying the patch, and
there are some tabbing issues in the file.
I would not consume "-g" option unless there is some logical link with the
feature. I'd be okay with "-m" if it is still needed. I would suggest to
use it for the choice strategy?
stringinfo: We already have PQExpBuffer imported, could we use that
instead? Having two set of struct/functions which do the same in the same
source file does not look like a good idea. If we do not parse the file,
nothing is needed, which is a relief.
Attached is my work-in-progress start at adding conninfo, that would need
to be improved with strategies.
--
Fabien.
Attachments:
pgbench-multi-connect-conninfo-1.patchtext/x-diff; name=pgbench-multi-connect-conninfo-1.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 0c60077e1f..d1390e83e5 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -840,6 +840,16 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<variablelist>
+ <varlistentry>
+ <term><option>--conninfo=</option><replaceable>conninfo</replaceable></term>
+ <listitem>
+ <para>
+ Add a <replaceable>conninfo</replaceable> connection information string
+ to the pool of possible connection strings.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-h</option> <replaceable>hostname</replaceable></term>
<term><option>--host=</option><replaceable>hostname</replaceable></term>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 4aeccd93af..ad71a568db 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -276,13 +276,36 @@ bool is_connect; /* establish connection for each transaction */
bool report_per_command; /* report per-command latencies */
int main_pid; /* main process id used in log filename */
+char *logfile_prefix = NULL;
+
+/* main connection definition */
const char *pghost = NULL;
const char *pgport = NULL;
const char *username = NULL;
const char *dbName = NULL;
-char *logfile_prefix = NULL;
const char *progname;
+/* connection info list */
+typedef struct conninfo_t
+{
+ const char *conninfo;
+ int errors;
+} conninfo_t;
+
+#define MAX_CONNINFO 8
+int n_conninfo = 0;
+conninfo_t conninfos[MAX_CONNINFO];
+
+static void
+push_conninfo(const char *ci)
+{
+ // FIXME nicer error
+ Assert(n_conninfo < MAX_CONNINFO);
+ conninfos[n_conninfo].conninfo = ci;
+ conninfos[n_conninfo].errors = 0;
+ n_conninfo++;
+}
+
#define WSEP '@' /* weight separator */
volatile bool timer_exceeded = false; /* flag from signal handler */
@@ -751,6 +774,7 @@ usage(void)
" -U, --username=USERNAME connect as specified database user\n"
" -V, --version output version information, then exit\n"
" -?, --help show this help, then exit\n"
+ " --conninfo=CONNINFO add a database server conninfo\n"
"\n"
"Report bugs to <%s>.\n"
"%s home page: <%s>\n",
@@ -1343,9 +1367,44 @@ tryExecuteStatement(PGconn *con, const char *sql)
PQclear(res);
}
+/* set up a connection to the backend with a provided conninfo */
+static PGconn *
+doConnectCI(bool change)
+{
+ static int nci = 0;
+ PGconn *conn;
+ conninfo_t *ci = &conninfos[nci];
+
+ conn = PQconnectdb(ci->conninfo);
+
+ if (!conn)
+ {
+ ci->errors++;
+ pg_log_error("connection to database \"%s\" failed", ci->conninfo);
+ /* try another one next time */
+ nci = (nci + 1) % n_conninfo;
+ return NULL;
+ }
+
+ if (PQstatus(conn) == CONNECTION_BAD)
+ {
+ ci->errors++;
+ pg_log_error("%s", PQerrorMessage(conn));
+ PQfinish(conn);
+ /* try another one next time */
+ nci = (nci + 1) % n_conninfo;
+ return NULL;
+ }
+
+ if (change)
+ nci = (nci + 1) % n_conninfo;
+
+ return conn;
+}
+
/* set up a connection to the backend */
static PGconn *
-doConnect(void)
+doConnectParams(void)
{
PGconn *conn;
bool new_pass;
@@ -1408,6 +1467,13 @@ doConnect(void)
return conn;
}
+/* create a connection with one of the above methods */
+static PGconn *
+doConnect(bool change)
+{
+ return n_conninfo > 0 ? doConnectCI(change) : doConnectParams();
+}
+
/* qsort comparator for Variable array */
static int
compareVariableNames(const void *v1, const void *v2)
@@ -3172,7 +3238,7 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
{
pg_time_usec_t start = now;
- if ((st->con = doConnect()) == NULL)
+ if ((st->con = doConnect(true)) == NULL)
{
pg_log_error("client %d aborted while establishing connection", st->id);
st->state = CSTATE_ABORTED;
@@ -4407,7 +4473,7 @@ runInitSteps(const char *initialize_steps)
initPQExpBuffer(&stats);
- if ((con = doConnect()) == NULL)
+ if ((con = doConnect(false)) == NULL)
exit(1);
setup_cancel_handler(NULL);
@@ -5769,6 +5835,7 @@ main(int argc, char **argv)
{"show-script", required_argument, NULL, 10},
{"partitions", required_argument, NULL, 11},
{"partition-method", required_argument, NULL, 12},
+ {"conninfo", required_argument, NULL, 13},
{NULL, 0, NULL, 0}
};
@@ -6137,6 +6204,9 @@ main(int argc, char **argv)
exit(1);
}
break;
+ case 13:
+ push_conninfo(pg_strdup(optarg));
+ break;
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit(1);
@@ -6363,7 +6433,7 @@ main(int argc, char **argv)
}
/* opening connection... */
- con = doConnect();
+ con = doConnect(false);
if (con == NULL)
exit(1);
@@ -6606,7 +6676,7 @@ threadRun(void *arg)
/* make connections to the database before starting */
for (int i = 0; i < nstate; i++)
{
- if ((state[i].con = doConnect()) == NULL)
+ if ((state[i].con = doConnect(true)) == NULL)
{
/*
* On connection failure, we meet the barrier here in place of
On Thu, Jul 01, 2021 at 12:22:45PM +0200, Fabien COELHO wrote:
Good. I was thinking of adding such capability, possibly for handling
connection errors and reconnecting…
round-robin and random make sense. I am wondering how round-robin
would work with -C, though? Would you just reuse the same connection
string as the one chosen at the starting point.
I was thinking of providing a allowing a list of conninfo strings with
repeated options, eg --conninfo "foo" --conninfo "bla"…
That was my first thought when reading the subject of this thread:
create a list of connection strings and pass one of them to
doConnect() to grab the properties looked for. That's a bit confusing
though as pgbench does not support directly connection strings, and we
should be careful to keep fallback_application_name intact.
Your approach using PGSERVICEFILE also make sense!
I am not sure that's actually needed here, as it is possible to pass
down a service name within a connection string. I think that you'd
better leave libpq do all the work related to a service file, if
specified. pgbench does not need to know any of that.
--
Michael
Bonjour Michaël,
Good. I was thinking of adding such capability, possibly for handling
connection errors and reconnecting…round-robin and random make sense. I am wondering how round-robin
would work with -C, though? Would you just reuse the same connection
string as the one chosen at the starting point.
Well, not necessarily, but this is debatable.
I was thinking of providing a allowing a list of conninfo strings with
repeated options, eg --conninfo "foo" --conninfo "bla"…That was my first thought when reading the subject of this thread:
create a list of connection strings and pass one of them to
doConnect() to grab the properties looked for. That's a bit confusing
though as pgbench does not support directly connection strings,
They are supported because libpq silently assumes that "dbname" can be a
full connection string.
and we should be careful to keep fallback_application_name intact.
Hmmm. See attached patch, ISTM that it does the right thing.
Your approach using PGSERVICEFILE also make sense!
I am not sure that's actually needed here, as it is possible to pass
down a service name within a connection string. I think that you'd
better leave libpq do all the work related to a service file, if
specified. pgbench does not need to know any of that.
Yes, this is an inconvenient with this approach, part of libpq machinery
is more or less replicated in pgbench, which is quite annoying, and less
powerful.
Attached my work-in-progress version, with a few open issues (eg probably
not thread safe), but comments about the provided feature are welcome.
I borrowed the "strategy" option, renamed policy, from the initial patch.
Pgbench just accepts several connection strings as parameters, eg:
pgbench ... "service=db1" "service=db2" "service=db3"
The next stage is to map scripts to connections types and connections
to connection types, so that pgbench could run W transactions against a
primary and R transactions agains a hot standby, for instance. I have a
some design for that, but nothing is implemented.
There is also the combination with the error handling patch to consider:
if a connection fails, a connection to a replica could be issued instead.
--
Fabien.
Attachments:
pgbench-multi-connect-conninfo-2.patchtext/x-diff; name=pgbench-multi-connect-conninfo-2.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 0c60077e1f..7b99344c90 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -29,7 +29,7 @@ PostgreSQL documentation
<cmdsynopsis>
<command>pgbench</command>
<arg rep="repeat"><replaceable>option</replaceable></arg>
- <arg choice="opt"><replaceable>dbname</replaceable></arg>
+ <arg rep="repeat"><replaceable>dbname or conninfo</replaceable></arg>
</cmdsynopsis>
</refsynopsisdiv>
@@ -160,6 +160,9 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
not specified, the environment variable
<envar>PGDATABASE</envar> is used. If that is not set, the
user name specified for the connection is used.
+ Alternatively, the <replaceable>dbname</replaceable> can be
+ a standard connection information string.
+ Several connections can be provided.
</para>
</listitem>
</varlistentry>
@@ -840,6 +843,21 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<variablelist>
+ <varlistentry>
+ <term><option>--connection-policy=</option><replaceable>policy</replaceable></term>
+ <listitem>
+ <para>
+ Set the connection policy when multiple connections are available.
+ Default is <literal>round-robin</literal> provided (<literal>ro</literal>).
+ Possible values are:
+ <literal>first</literal> (<literal>f</literal>),
+ <literal>random</literal> (<literal>ra</literal>),
+ <literal>round-robin</literal> (<literal>ro</literal>),
+ <literal>working</literal> (<literal>w</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-h</option> <replaceable>hostname</replaceable></term>
<term><option>--host=</option><replaceable>hostname</replaceable></term>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index b0e20c46ae..95e58f0573 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -277,13 +277,39 @@ bool is_connect; /* establish connection for each transaction */
bool report_per_command; /* report per-command latencies */
int main_pid; /* main process id used in log filename */
+char *logfile_prefix = NULL;
+
+/* main connection definition */
const char *pghost = NULL;
const char *pgport = NULL;
const char *username = NULL;
-const char *dbName = NULL;
-char *logfile_prefix = NULL;
const char *progname;
+/* multi connections */
+typedef enum mc_policy_t
+{
+ MC_UNKNOWN = 0,
+ MC_FIRST,
+ MC_RANDOM,
+ MC_ROUND_ROBIN,
+ MC_WORKING
+} mc_policy_t;
+
+/* connection info list */
+typedef struct connection_t
+{
+ const char *connection; /* conninfo or dbname */
+ int errors; /* number of connection errors */
+} connection_t;
+
+static int n_connections = 0;
+static connection_t *connections = NULL;
+static mc_policy_t mc_policy = MC_ROUND_ROBIN;
+
+/* last used connection */
+// FIXME per thread?
+static int current_connection = 0;
+
#define WSEP '@' /* weight separator */
volatile bool timer_exceeded = false; /* flag from signal handler */
@@ -701,7 +727,7 @@ usage(void)
{
printf("%s is a benchmarking tool for PostgreSQL.\n\n"
"Usage:\n"
- " %s [OPTION]... [DBNAME]\n"
+ " %s [OPTION]... [DBNAME or CONNINFO ...]\n"
"\nInitialization options:\n"
" -i, --initialize invokes initialization mode\n"
" -I, --init-steps=[" ALL_INIT_STEPS "]+ (default \"" DEFAULT_INIT_STEPS "\")\n"
@@ -756,6 +782,7 @@ usage(void)
" -h, --host=HOSTNAME database server host or socket directory\n"
" -p, --port=PORT database server port number\n"
" -U, --username=USERNAME connect as specified database user\n"
+ " --connection-policy=S set multiple connection policy (\"first\", \"rand\", \"round-robin\", \"working\")\n"
" -V, --version output version information, then exit\n"
" -?, --help show this help, then exit\n"
"\n"
@@ -1350,13 +1377,89 @@ tryExecuteStatement(PGconn *con, const char *sql)
PQclear(res);
}
+/* store a new connection information string */
+static void
+push_connection(const char *c)
+{
+ connections = pg_realloc(connections, sizeof(connection_t) * (n_connections+1));
+ connections[n_connections].connection = pg_strdup(c);
+ connections[n_connections].errors = 0;
+ n_connections++;
+}
+
+/* switch connection */
+static int
+next_connection(int *pci)
+{
+ int ci;
+
+ ci = ((*pci) + 1) % n_connections;
+ *pci = ci;
+
+ return ci;
+}
+
+/* return the connection index to use for next attempt */
+static int
+choose_connection(int *pci)
+{
+ int ci;
+
+ switch (mc_policy)
+ {
+ case MC_FIRST:
+ ci = 0;
+ break;
+ case MC_RANDOM:
+ // FIXME should use a prng state ; not thread safe ;
+ ci = (int) getrand(&base_random_sequence, 0, n_connections-1);
+ *pci = ci;
+ break;
+ case MC_ROUND_ROBIN:
+ ci = next_connection(pci);
+ break;
+ case MC_WORKING:
+ ci = *pci;
+ break;
+ default:
+ pg_log_fatal("unexpected multi connection policy: %d", mc_policy);
+ exit(1);
+ }
+
+ return ci;
+}
+
+/* return multi-connection policy based on its name or shortest prefix */
+static mc_policy_t
+get_connection_policy(const char *s)
+{
+ if (s == NULL || *s == '\0' || strcmp(s, "first") == 0 || strcmp(s, "f") == 0)
+ return MC_FIRST;
+ else if (strcmp(s, "random") == 0 || strcmp(s, "ra") == 0)
+ return MC_RANDOM;
+ else if (strcmp(s, "round-robin") == 0 || strcmp(s, "ro") == 0)
+ return MC_ROUND_ROBIN;
+ else if (strcmp(s, "working") == 0 || strcmp(s, "w") == 0)
+ return MC_WORKING;
+ else
+ return MC_UNKNOWN;
+}
+
+/* get backend connection info */
+static connection_t *
+getConnection(void)
+{
+ return &connections[choose_connection(¤t_connection)];
+}
+
/* set up a connection to the backend */
static PGconn *
doConnect(void)
{
- PGconn *conn;
- bool new_pass;
- static char *password = NULL;
+ PGconn *conn;
+ bool new_pass;
+ static char *password = NULL;
+ connection_t *ci = getConnection();
/*
* Start the connection. Loop until we have a password if requested by
@@ -1377,8 +1480,9 @@ doConnect(void)
values[2] = username;
keywords[3] = "password";
values[3] = password;
+ /* dbname can include a full conninfo */
keywords[4] = "dbname";
- values[4] = dbName;
+ values[4] = ci->connection;
keywords[5] = "fallback_application_name";
values[5] = progname;
keywords[6] = NULL;
@@ -1386,11 +1490,12 @@ doConnect(void)
new_pass = false;
+ pg_log_debug("connecting to %s", ci->connection);
conn = PQconnectdbParams(keywords, values, true);
if (!conn)
{
- pg_log_error("connection to database \"%s\" failed", dbName);
+ pg_log_error("connection to database \"%s\" failed", ci->connection);
return NULL;
}
@@ -1409,6 +1514,9 @@ doConnect(void)
{
pg_log_error("%s", PQerrorMessage(conn));
PQfinish(conn);
+ ci->errors += 1;
+ if (mc_policy == MC_WORKING)
+ (void) next_connection(¤t_connection);
return NULL;
}
@@ -5787,6 +5895,7 @@ main(int argc, char **argv)
{"show-script", required_argument, NULL, 10},
{"partitions", required_argument, NULL, 11},
{"partition-method", required_argument, NULL, 12},
+ {"connection-policy", required_argument, NULL, 13},
{NULL, 0, NULL, 0}
};
@@ -6140,6 +6249,14 @@ main(int argc, char **argv)
exit(1);
}
break;
+ case 13:
+ mc_policy = get_connection_policy(optarg);
+ if (mc_policy == MC_UNKNOWN)
+ {
+ pg_log_fatal("unexpected connection policy: %s", optarg);
+ exit(1);
+ }
+ break;
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit(1);
@@ -6194,23 +6311,18 @@ main(int argc, char **argv)
throttle_delay *= nthreads;
if (argc > optind)
- dbName = argv[optind++];
+ {
+ while (optind < argc)
+ push_connection(argv[optind++]);
+ }
else
{
if ((env = getenv("PGDATABASE")) != NULL && *env != '\0')
- dbName = env;
+ push_connection(env);
else if ((env = getenv("PGUSER")) != NULL && *env != '\0')
- dbName = env;
+ push_connection(env);
else
- dbName = get_user_name_or_exit(progname);
- }
-
- if (optind < argc)
- {
- pg_log_fatal("too many command-line arguments (first is \"%s\")",
- argv[optind]);
- fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
- exit(1);
+ push_connection(get_user_name_or_exit(progname));
}
if (is_init_mode)
Good. I was thinking of adding such capability, possibly for handling
connection errors and reconnecting…round-robin and random make sense. I am wondering how round-robin
would work with -C, though? Would you just reuse the same connection
string as the one chosen at the starting point.Well, not necessarily, but this is debatable.
My expectation for such a behavior would be that it would reconnect to
a random connstring each time, otherwise what's the point of using
this with -C? If we needed to forbid some option combinations that is
also an option.
I was thinking of providing a allowing a list of conninfo strings with
repeated options, eg --conninfo "foo" --conninfo "bla"…That was my first thought when reading the subject of this thread:
create a list of connection strings and pass one of them to
doConnect() to grab the properties looked for. That's a bit confusing
though as pgbench does not support directly connection strings,They are supported because libpq silently assumes that "dbname" can be a
full connection string.and we should be careful to keep fallback_application_name intact.
Hmmm. See attached patch, ISTM that it does the right thing.
I guess the multiple --conninfo approach is fine; I personally liked
having the list come from a file, as you could benchmark different
groups/clusters based on a file, much easier than constructing
multiple pgbench invocations depending. I can see an argument for
both approaches. The PGSERVICEFILE was an idea I'd had to store
easily indexed groups of connection information in a way that I didn't
need to know all the details, could easily parse, and could later pass
in the ENV so libpq could just pull out the information.
Your approach using PGSERVICEFILE also make sense!
I am not sure that's actually needed here, as it is possible to pass
down a service name within a connection string. I think that you'd
better leave libpq do all the work related to a service file, if
specified. pgbench does not need to know any of that.Yes, this is an inconvenient with this approach, part of libpq machinery
is more or less replicated in pgbench, which is quite annoying, and less
powerful.
There is some small fraction reproduced here just to pull out the
named sections; no other parsing should be done though.
Attached my work-in-progress version, with a few open issues (eg probably
not thread safe), but comments about the provided feature are welcome.I borrowed the "strategy" option, renamed policy, from the initial patch.
Pgbench just accepts several connection strings as parameters, eg:pgbench ... "service=db1" "service=db2" "service=db3"
The next stage is to map scripts to connections types and connections
to connection types, so that pgbench could run W transactions against a
primary and R transactions agains a hot standby, for instance. I have a
some design for that, but nothing is implemented.There is also the combination with the error handling patch to consider:
if a connection fails, a connection to a replica could be issued instead.
I'll see if I can take a look at your latest patch. I was also
wondering about how we should handle `pgbench -i` with multiple
connection strings; currently it would only initialize with the first
DSN it gets, but it probably makes sense to run initialize against all
of the databases (or at least attempt to). Maybe this is one argument
for the multiple --conninfo handling, since you could explicitly pass
the databases you want. (Not that it is hard to just loop over
connection info and `pgbench -i` with ENV, or any other number of ways
to accomplish the same thing.)
Best,
David
Hello David,
round-robin and random make sense. I am wondering how round-robin
would work with -C, though? Would you just reuse the same connection
string as the one chosen at the starting point.Well, not necessarily, but this is debatable.
My expectation for such a behavior would be that it would reconnect to
a random connstring each time, otherwise what's the point of using
this with -C? If we needed to forbid some option combinations that is
also an option.
Yep. ISTM that it should follow the connection policy/strategy, what ever
it is.
I was thinking of providing a allowing a list of conninfo strings with
repeated options, eg --conninfo "foo" --conninfo "bla"…That was my first thought when reading the subject of this thread:
create a list of connection strings and pass one of them to
doConnect() to grab the properties looked for. That's a bit confusing
though as pgbench does not support directly connection strings,They are supported because libpq silently assumes that "dbname" can be a
full connection string.and we should be careful to keep fallback_application_name intact.
Hmmm. See attached patch, ISTM that it does the right thing.
I guess the multiple --conninfo approach is fine; I personally liked
having the list come from a file, as you could benchmark different
groups/clusters based on a file, much easier than constructing
multiple pgbench invocations depending. I can see an argument for
both approaches. The PGSERVICEFILE was an idea I'd had to store
easily indexed groups of connection information in a way that I didn't
need to know all the details, could easily parse, and could later pass
in the ENV so libpq could just pull out the information.
The attached version does work with the service file if the user provides
"service=whatever" on the command line. The main difference is that it
sticks to the libpq policy to use an explicit connection string or list of
connection strings.
Also, note that the patch I sent dropped the --conninfo option.
Connections are simply tghe last arguments to pgbench.
I'll see if I can take a look at your latest patch.
Thanks!
I was also wondering about how we should handle `pgbench -i` with
multiple connection strings; currently it would only initialize with the
first DSN it gets, but it probably makes sense to run initialize against
all of the databases (or at least attempt to).
I'll tend to disagree on this one. Pgbench whole expectation is to run
against "one" system, which might be composed of several nodes because of
replications. I do not think that it is desirable to jump to "serveral
fully independent databases".
Maybe this is one argument for the multiple --conninfo handling, since
you could explicitly pass the databases you want. (Not that it is hard
to just loop over connection info and `pgbench -i` with ENV, or any
other number of ways to accomplish the same thing.)
Yep.
--
Fabien.
Hi guys,
It looks like David sent a patch and Fabien sent a followup patch. But
there hasn't been a whole lot of discussion or further patches.
It sounds like there are some basic questions about what the right
interface should be. Are there specific questions that would be
helpful for moving forward?
Hello Greg,
It looks like David sent a patch and Fabien sent a followup patch. But
there hasn't been a whole lot of discussion or further patches.It sounds like there are some basic questions about what the right
interface should be. Are there specific questions that would be
helpful for moving forward?
Review the designs and patches and tell us what you think?
Personnaly, I think that allowing multiple connections is a good thing,
especially if the code impact is reduced, which is the case with the
version I sent.
Then for me the next step would be to have a reconnection on errors so as
to implement a client-side failover policy that could help testing a
server-failover performance impact. I have done that internally but it
requires that "Pgbench Serialization and deadlock errors" to land, as it
would just be another error that can be handled.
--
Fabien.
The current version of the patch does not apply, so I could not test it.
Here are some comments I have.
Pgbench is a simple benchmark tool by design, and I wonder if adding
a multiconnect feature will cause pgbench to be used incorrectly.
A real world use-case will be helpful for this thread.
For the current patch, Should the report also cover per-database statistics (tps/latency/etc.) ?
Regards,
Sami Imseih
Amazon Web Services
Hi Sami,
Pgbench is a simple benchmark tool by design, and I wonder if adding
a multiconnect feature will cause pgbench to be used incorrectly.
Maybe, but I do not see how it would be worse that what pgbench already
allows.
A real world use-case will be helpful for this thread.
Basically more versatile testing for non single host setups.
For instance, it would allow testing directly a multi-master setup, such
as bucardo, symmetricds or coackroachdb.
It would be a first step on the path to allow interesting features such
as:
- testing failover setup, on connection error a client could connect to
another host.
- testing a primary/standby setup, with write transactions sent to the
primary and read transactions sent to the standbyes.
Basically I have no doubt that it can be useful.
For the current patch, Should the report also cover per-database
statistics (tps/latency/etc.) ?
That could be a "per-connection" option. If there is a reasonable use case
I think that it would be an easy enough feature to implement.
Attached a rebased version.
--
Fabien.
Attachments:
pgbench-multi-connect-conninfo-3.patchtext/x-diff; name=pgbench-multi-connect-conninfo-3.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index be1896fa99..69bd5b76f1 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -29,7 +29,7 @@ PostgreSQL documentation
<cmdsynopsis>
<command>pgbench</command>
<arg rep="repeat"><replaceable>option</replaceable></arg>
- <arg choice="opt"><replaceable>dbname</replaceable></arg>
+ <arg rep="repeat"><replaceable>dbname or conninfo</replaceable></arg>
</cmdsynopsis>
</refsynopsisdiv>
@@ -160,6 +160,9 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
not specified, the environment variable
<envar>PGDATABASE</envar> is used. If that is not set, the
user name specified for the connection is used.
+ Alternatively, the <replaceable>dbname</replaceable> can be
+ a standard connection information string.
+ Several connections can be provided.
</para>
</listitem>
</varlistentry>
@@ -843,6 +846,21 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<variablelist>
+ <varlistentry>
+ <term><option>--connection-policy=</option><replaceable>policy</replaceable></term>
+ <listitem>
+ <para>
+ Set the connection policy when multiple connections are available.
+ Default is <literal>round-robin</literal> provided (<literal>ro</literal>).
+ Possible values are:
+ <literal>first</literal> (<literal>f</literal>),
+ <literal>random</literal> (<literal>ra</literal>),
+ <literal>round-robin</literal> (<literal>ro</literal>),
+ <literal>working</literal> (<literal>w</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-h</option> <replaceable>hostname</replaceable></term>
<term><option>--host=</option><replaceable>hostname</replaceable></term>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 000ffc4a5c..5006e21766 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -278,13 +278,39 @@ bool is_connect; /* establish connection for each transaction */
bool report_per_command; /* report per-command latencies */
int main_pid; /* main process id used in log filename */
+char *logfile_prefix = NULL;
+
+/* main connection definition */
const char *pghost = NULL;
const char *pgport = NULL;
const char *username = NULL;
-const char *dbName = NULL;
-char *logfile_prefix = NULL;
const char *progname;
+/* multi connections */
+typedef enum mc_policy_t
+{
+ MC_UNKNOWN = 0,
+ MC_FIRST,
+ MC_RANDOM,
+ MC_ROUND_ROBIN,
+ MC_WORKING
+} mc_policy_t;
+
+/* connection info list */
+typedef struct connection_t
+{
+ const char *connection; /* conninfo or dbname */
+ int errors; /* number of connection errors */
+} connection_t;
+
+static int n_connections = 0;
+static connection_t *connections = NULL;
+static mc_policy_t mc_policy = MC_ROUND_ROBIN;
+
+/* last used connection */
+// FIXME per thread?
+static int current_connection = 0;
+
#define WSEP '@' /* weight separator */
volatile bool timer_exceeded = false; /* flag from signal handler */
@@ -694,7 +720,7 @@ usage(void)
{
printf("%s is a benchmarking tool for PostgreSQL.\n\n"
"Usage:\n"
- " %s [OPTION]... [DBNAME]\n"
+ " %s [OPTION]... [DBNAME or CONNINFO ...]\n"
"\nInitialization options:\n"
" -i, --initialize invokes initialization mode\n"
" -I, --init-steps=[" ALL_INIT_STEPS "]+ (default \"" DEFAULT_INIT_STEPS "\")\n"
@@ -749,6 +775,7 @@ usage(void)
" -h, --host=HOSTNAME database server host or socket directory\n"
" -p, --port=PORT database server port number\n"
" -U, --username=USERNAME connect as specified database user\n"
+ " --connection-policy=S set multiple connection policy (\"first\", \"rand\", \"round-robin\", \"working\")\n"
" -V, --version output version information, then exit\n"
" -?, --help show this help, then exit\n"
"\n"
@@ -1323,13 +1350,89 @@ tryExecuteStatement(PGconn *con, const char *sql)
PQclear(res);
}
+/* store a new connection information string */
+static void
+push_connection(const char *c)
+{
+ connections = pg_realloc(connections, sizeof(connection_t) * (n_connections+1));
+ connections[n_connections].connection = pg_strdup(c);
+ connections[n_connections].errors = 0;
+ n_connections++;
+}
+
+/* switch connection */
+static int
+next_connection(int *pci)
+{
+ int ci;
+
+ ci = ((*pci) + 1) % n_connections;
+ *pci = ci;
+
+ return ci;
+}
+
+/* return the connection index to use for next attempt */
+static int
+choose_connection(int *pci)
+{
+ int ci;
+
+ switch (mc_policy)
+ {
+ case MC_FIRST:
+ ci = 0;
+ break;
+ case MC_RANDOM:
+ // FIXME should use a prng state ; not thread safe ;
+ ci = (int) getrand(&base_random_sequence, 0, n_connections-1);
+ *pci = ci;
+ break;
+ case MC_ROUND_ROBIN:
+ ci = next_connection(pci);
+ break;
+ case MC_WORKING:
+ ci = *pci;
+ break;
+ default:
+ pg_log_fatal("unexpected multi connection policy: %d", mc_policy);
+ exit(1);
+ }
+
+ return ci;
+}
+
+/* return multi-connection policy based on its name or shortest prefix */
+static mc_policy_t
+get_connection_policy(const char *s)
+{
+ if (s == NULL || *s == '\0' || strcmp(s, "first") == 0 || strcmp(s, "f") == 0)
+ return MC_FIRST;
+ else if (strcmp(s, "random") == 0 || strcmp(s, "ra") == 0)
+ return MC_RANDOM;
+ else if (strcmp(s, "round-robin") == 0 || strcmp(s, "ro") == 0)
+ return MC_ROUND_ROBIN;
+ else if (strcmp(s, "working") == 0 || strcmp(s, "w") == 0)
+ return MC_WORKING;
+ else
+ return MC_UNKNOWN;
+}
+
+/* get backend connection info */
+static connection_t *
+getConnection(void)
+{
+ return &connections[choose_connection(¤t_connection)];
+}
+
/* set up a connection to the backend */
static PGconn *
doConnect(void)
{
- PGconn *conn;
- bool new_pass;
- static char *password = NULL;
+ PGconn *conn;
+ bool new_pass;
+ static char *password = NULL;
+ connection_t *ci = getConnection();
/*
* Start the connection. Loop until we have a password if requested by
@@ -1350,8 +1453,9 @@ doConnect(void)
values[2] = username;
keywords[3] = "password";
values[3] = password;
+ /* dbname can include a full conninfo */
keywords[4] = "dbname";
- values[4] = dbName;
+ values[4] = ci->connection;
keywords[5] = "fallback_application_name";
values[5] = progname;
keywords[6] = NULL;
@@ -1359,11 +1463,12 @@ doConnect(void)
new_pass = false;
+ pg_log_debug("connecting to %s", ci->connection);
conn = PQconnectdbParams(keywords, values, true);
if (!conn)
{
- pg_log_error("connection to database \"%s\" failed", dbName);
+ pg_log_error("connection to database \"%s\" failed", ci->connection);
return NULL;
}
@@ -1382,6 +1487,9 @@ doConnect(void)
{
pg_log_error("%s", PQerrorMessage(conn));
PQfinish(conn);
+ ci->errors += 1;
+ if (mc_policy == MC_WORKING)
+ (void) next_connection(¤t_connection);
return NULL;
}
@@ -5797,6 +5905,7 @@ main(int argc, char **argv)
{"show-script", required_argument, NULL, 10},
{"partitions", required_argument, NULL, 11},
{"partition-method", required_argument, NULL, 12},
+ {"connection-policy", required_argument, NULL, 13},
{NULL, 0, NULL, 0}
};
@@ -6150,6 +6259,14 @@ main(int argc, char **argv)
exit(1);
}
break;
+ case 13:
+ mc_policy = get_connection_policy(optarg);
+ if (mc_policy == MC_UNKNOWN)
+ {
+ pg_log_fatal("unexpected connection policy: %s", optarg);
+ exit(1);
+ }
+ break;
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit(1);
@@ -6204,23 +6321,18 @@ main(int argc, char **argv)
throttle_delay *= nthreads;
if (argc > optind)
- dbName = argv[optind++];
+ {
+ while (optind < argc)
+ push_connection(argv[optind++]);
+ }
else
{
if ((env = getenv("PGDATABASE")) != NULL && *env != '\0')
- dbName = env;
+ push_connection(env);
else if ((env = getenv("PGUSER")) != NULL && *env != '\0')
- dbName = env;
+ push_connection(env);
else
- dbName = get_user_name_or_exit(progname);
- }
-
- if (optind < argc)
- {
- pg_log_fatal("too many command-line arguments (first is \"%s\")",
- argv[optind]);
- fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
- exit(1);
+ push_connection(get_user_name_or_exit(progname));
}
if (is_init_mode)
On Sat, Mar 19, 2022 at 11:43 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Hi Sami,
Pgbench is a simple benchmark tool by design, and I wonder if adding
a multiconnect feature will cause pgbench to be used incorrectly.Maybe, but I do not see how it would be worse that what pgbench already
allows.
I agree that pgbench is simple; perhaps really too simple when it comes to
being able to measure much more than basic query flows. What pgbench does
have in its favor is being distributed with the core distribution.
I think there is definitely space for a more complicated benchmarking tool
that exercises more scenarios and more realistic query patterns and
scenarios. Whether that is distributed with the core is another question.
David
Pgbench is a simple benchmark tool by design, and I wonder if adding
a multiconnect feature will cause pgbench to be used incorrectly.Maybe, but I do not see how it would be worse that what pgbench already
allows.I agree that pgbench is simple; perhaps really too simple when it comes to
being able to measure much more than basic query flows. What pgbench does
have in its favor is being distributed with the core distribution.I think there is definitely space for a more complicated benchmarking tool
that exercises more scenarios and more realistic query patterns and
scenarios. Whether that is distributed with the core is another question.
As far as this feature is concerned, the source code impact of the patch
is very small, so I do not think that is worth barring this feature on
that ground.
--
Fabien.
According to the cfbot this patch needs a rebase
Indeed. v4 attached.
--
Fabien.
Attachments:
pgbench-multi-connect-conninfo-4.patchtext/x-diff; name=pgbench-multi-connect-conninfo-4.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index ebdb4b3f46..d96d2d291d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -29,7 +29,7 @@ PostgreSQL documentation
<cmdsynopsis>
<command>pgbench</command>
<arg rep="repeat"><replaceable>option</replaceable></arg>
- <arg choice="opt"><replaceable>dbname</replaceable></arg>
+ <arg rep="repeat"><replaceable>dbname or conninfo</replaceable></arg>
</cmdsynopsis>
</refsynopsisdiv>
@@ -169,6 +169,9 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
not specified, the environment variable
<envar>PGDATABASE</envar> is used. If that is not set, the
user name specified for the connection is used.
+ Alternatively, the <replaceable>dbname</replaceable> can be
+ a standard connection information string.
+ Several connections can be provided.
</para>
</listitem>
</varlistentry>
@@ -918,6 +921,21 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<variablelist>
+ <varlistentry>
+ <term><option>--connection-policy=</option><replaceable>policy</replaceable></term>
+ <listitem>
+ <para>
+ Set the connection policy when multiple connections are available.
+ Default is <literal>round-robin</literal> provided (<literal>ro</literal>).
+ Possible values are:
+ <literal>first</literal> (<literal>f</literal>),
+ <literal>random</literal> (<literal>ra</literal>),
+ <literal>round-robin</literal> (<literal>ro</literal>),
+ <literal>working</literal> (<literal>w</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-h</option> <replaceable>hostname</replaceable></term>
<term><option>--host=</option><replaceable>hostname</replaceable></term>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index acf3e56413..d99d40fbb9 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -305,13 +305,39 @@ uint32 max_tries = 1;
bool failures_detailed = false; /* whether to group failures in reports
* or logs by basic types */
+char *logfile_prefix = NULL;
+
+/* main connection definition */
const char *pghost = NULL;
const char *pgport = NULL;
const char *username = NULL;
-const char *dbName = NULL;
-char *logfile_prefix = NULL;
const char *progname;
+/* multi connections */
+typedef enum mc_policy_t
+{
+ MC_UNKNOWN = 0,
+ MC_FIRST,
+ MC_RANDOM,
+ MC_ROUND_ROBIN,
+ MC_WORKING
+} mc_policy_t;
+
+/* connection info list */
+typedef struct connection_t
+{
+ const char *connection; /* conninfo or dbname */
+ int errors; /* number of connection errors */
+} connection_t;
+
+static int n_connections = 0;
+static connection_t *connections = NULL;
+static mc_policy_t mc_policy = MC_ROUND_ROBIN;
+
+/* last used connection */
+// FIXME per thread?
+static int current_connection = 0;
+
#define WSEP '@' /* weight separator */
volatile bool timer_exceeded = false; /* flag from signal handler */
@@ -873,7 +899,7 @@ usage(void)
{
printf("%s is a benchmarking tool for PostgreSQL.\n\n"
"Usage:\n"
- " %s [OPTION]... [DBNAME]\n"
+ " %s [OPTION]... [DBNAME or CONNINFO ...]\n"
"\nInitialization options:\n"
" -i, --initialize invokes initialization mode\n"
" -I, --init-steps=[" ALL_INIT_STEPS "]+ (default \"" DEFAULT_INIT_STEPS "\")\n"
@@ -931,6 +957,7 @@ usage(void)
" -h, --host=HOSTNAME database server host or socket directory\n"
" -p, --port=PORT database server port number\n"
" -U, --username=USERNAME connect as specified database user\n"
+ " --connection-policy=S set multiple connection policy (\"first\", \"rand\", \"round-robin\", \"working\")\n"
" -V, --version output version information, then exit\n"
" -?, --help show this help, then exit\n"
"\n"
@@ -1538,13 +1565,89 @@ tryExecuteStatement(PGconn *con, const char *sql)
PQclear(res);
}
+/* store a new connection information string */
+static void
+push_connection(const char *c)
+{
+ connections = pg_realloc(connections, sizeof(connection_t) * (n_connections+1));
+ connections[n_connections].connection = pg_strdup(c);
+ connections[n_connections].errors = 0;
+ n_connections++;
+}
+
+/* switch connection */
+static int
+next_connection(int *pci)
+{
+ int ci;
+
+ ci = ((*pci) + 1) % n_connections;
+ *pci = ci;
+
+ return ci;
+}
+
+/* return the connection index to use for next attempt */
+static int
+choose_connection(int *pci)
+{
+ int ci;
+
+ switch (mc_policy)
+ {
+ case MC_FIRST:
+ ci = 0;
+ break;
+ case MC_RANDOM:
+ // FIXME should use a prng state ; not thread safe ;
+ ci = (int) getrand(&base_random_sequence, 0, n_connections-1);
+ *pci = ci;
+ break;
+ case MC_ROUND_ROBIN:
+ ci = next_connection(pci);
+ break;
+ case MC_WORKING:
+ ci = *pci;
+ break;
+ default:
+ pg_log_fatal("unexpected multi connection policy: %d", mc_policy);
+ exit(1);
+ }
+
+ return ci;
+}
+
+/* return multi-connection policy based on its name or shortest prefix */
+static mc_policy_t
+get_connection_policy(const char *s)
+{
+ if (s == NULL || *s == '\0' || strcmp(s, "first") == 0 || strcmp(s, "f") == 0)
+ return MC_FIRST;
+ else if (strcmp(s, "random") == 0 || strcmp(s, "ra") == 0)
+ return MC_RANDOM;
+ else if (strcmp(s, "round-robin") == 0 || strcmp(s, "ro") == 0)
+ return MC_ROUND_ROBIN;
+ else if (strcmp(s, "working") == 0 || strcmp(s, "w") == 0)
+ return MC_WORKING;
+ else
+ return MC_UNKNOWN;
+}
+
+/* get backend connection info */
+static connection_t *
+getConnection(void)
+{
+ return &connections[choose_connection(¤t_connection)];
+}
+
/* set up a connection to the backend */
static PGconn *
doConnect(void)
{
- PGconn *conn;
- bool new_pass;
- static char *password = NULL;
+ PGconn *conn;
+ bool new_pass;
+ static char *password = NULL;
+ connection_t *ci = getConnection();
/*
* Start the connection. Loop until we have a password if requested by
@@ -1565,8 +1668,9 @@ doConnect(void)
values[2] = username;
keywords[3] = "password";
values[3] = password;
+ /* dbname can include a full conninfo */
keywords[4] = "dbname";
- values[4] = dbName;
+ values[4] = ci->connection;
keywords[5] = "fallback_application_name";
values[5] = progname;
keywords[6] = NULL;
@@ -1574,11 +1678,12 @@ doConnect(void)
new_pass = false;
+ pg_log_debug("connecting to %s", ci->connection);
conn = PQconnectdbParams(keywords, values, true);
if (!conn)
{
- pg_log_error("connection to database \"%s\" failed", dbName);
+ pg_log_error("connection to database \"%s\" failed", ci->connection);
return NULL;
}
@@ -1597,6 +1702,9 @@ doConnect(void)
{
pg_log_error("%s", PQerrorMessage(conn));
PQfinish(conn);
+ ci->errors += 1;
+ if (mc_policy == MC_WORKING)
+ (void) next_connection(¤t_connection);
return NULL;
}
@@ -6570,6 +6678,7 @@ main(int argc, char **argv)
{"failures-detailed", no_argument, NULL, 13},
{"max-tries", required_argument, NULL, 14},
{"verbose-errors", no_argument, NULL, 15},
+ {"connection-policy", required_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
@@ -6945,6 +7054,14 @@ main(int argc, char **argv)
benchmarking_option_set = true;
verbose_errors = true;
break;
+ case 16:
+ mc_policy = get_connection_policy(optarg);
+ if (mc_policy == MC_UNKNOWN)
+ {
+ pg_log_fatal("unexpected connection policy: %s", optarg);
+ exit(1);
+ }
+ break;
default:
fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
exit(1);
@@ -6999,23 +7116,18 @@ main(int argc, char **argv)
throttle_delay *= nthreads;
if (argc > optind)
- dbName = argv[optind++];
+ {
+ while (optind < argc)
+ push_connection(argv[optind++]);
+ }
else
{
if ((env = getenv("PGDATABASE")) != NULL && *env != '\0')
- dbName = env;
+ push_connection(env);
else if ((env = getenv("PGUSER")) != NULL && *env != '\0')
- dbName = env;
+ push_connection(env);
else
- dbName = get_user_name_or_exit(progname);
- }
-
- if (optind < argc)
- {
- pg_log_fatal("too many command-line arguments (first is \"%s\")",
- argv[optind]);
- fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
- exit(1);
+ push_connection(get_user_name_or_exit(progname));
}
if (is_init_mode)
2022年4月2日(土) 22:35 Fabien COELHO <coelho@cri.ensmp.fr>:
According to the cfbot this patch needs a rebase
Indeed. v4 attached.
Hi
cfbot reports the patch no longer applies. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch.
Thanks
Ian Barwick
Hello Ian,
cfbot reports the patch no longer applies. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch.
Attached a v5 which is just a rebase.
--
Fabien.
Attachments:
pgbench-multi-connect-conninfo-5.patchtext/x-diff; name=pgbench-multi-connect-conninfo-5.patchDownload
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 40e6a50a7f..a3ae7cc9be 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -29,7 +29,7 @@ PostgreSQL documentation
<cmdsynopsis>
<command>pgbench</command>
<arg rep="repeat"><replaceable>option</replaceable></arg>
- <arg choice="opt"><replaceable>dbname</replaceable></arg>
+ <arg rep="repeat"><replaceable>dbname or conninfo</replaceable></arg>
</cmdsynopsis>
</refsynopsisdiv>
@@ -169,6 +169,9 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
not specified, the environment variable
<envar>PGDATABASE</envar> is used. If that is not set, the
user name specified for the connection is used.
+ Alternatively, the <replaceable>dbname</replaceable> can be
+ a standard connection information string.
+ Several connections can be provided.
</para>
</listitem>
</varlistentry>
@@ -918,6 +921,21 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
<variablelist>
+ <varlistentry>
+ <term><option>--connection-policy=</option><replaceable>policy</replaceable></term>
+ <listitem>
+ <para>
+ Set the connection policy when multiple connections are available.
+ Default is <literal>round-robin</literal> provided (<literal>ro</literal>).
+ Possible values are:
+ <literal>first</literal> (<literal>f</literal>),
+ <literal>random</literal> (<literal>ra</literal>),
+ <literal>round-robin</literal> (<literal>ro</literal>),
+ <literal>working</literal> (<literal>w</literal>).
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry>
<term><option>-h</option> <replaceable>hostname</replaceable></term>
<term><option>--host=</option><replaceable>hostname</replaceable></term>
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index b208d74767..02f8278b34 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -301,13 +301,39 @@ uint32 max_tries = 1;
bool failures_detailed = false; /* whether to group failures in
* reports or logs by basic types */
+char *logfile_prefix = NULL;
+
+/* main connection definition */
const char *pghost = NULL;
const char *pgport = NULL;
const char *username = NULL;
-const char *dbName = NULL;
-char *logfile_prefix = NULL;
const char *progname;
+/* multi connections */
+typedef enum mc_policy_t
+{
+ MC_UNKNOWN = 0,
+ MC_FIRST,
+ MC_RANDOM,
+ MC_ROUND_ROBIN,
+ MC_WORKING
+} mc_policy_t;
+
+/* connection info list */
+typedef struct connection_t
+{
+ const char *connection; /* conninfo or dbname */
+ int errors; /* number of connection errors */
+} connection_t;
+
+static int n_connections = 0;
+static connection_t *connections = NULL;
+static mc_policy_t mc_policy = MC_ROUND_ROBIN;
+
+/* last used connection */
+// FIXME per thread?
+static int current_connection = 0;
+
#define WSEP '@' /* weight separator */
volatile bool timer_exceeded = false; /* flag from signal handler */
@@ -871,7 +897,7 @@ usage(void)
{
printf("%s is a benchmarking tool for PostgreSQL.\n\n"
"Usage:\n"
- " %s [OPTION]... [DBNAME]\n"
+ " %s [OPTION]... [DBNAME or CONNINFO ...]\n"
"\nInitialization options:\n"
" -i, --initialize invokes initialization mode\n"
" -I, --init-steps=[" ALL_INIT_STEPS "]+ (default \"" DEFAULT_INIT_STEPS "\")\n"
@@ -929,6 +955,7 @@ usage(void)
" -h, --host=HOSTNAME database server host or socket directory\n"
" -p, --port=PORT database server port number\n"
" -U, --username=USERNAME connect as specified database user\n"
+ " --connection-policy=S set multiple connection policy (\"first\", \"rand\", \"round-robin\", \"working\")\n"
" -V, --version output version information, then exit\n"
" -?, --help show this help, then exit\n"
"\n"
@@ -1535,13 +1562,89 @@ tryExecuteStatement(PGconn *con, const char *sql)
PQclear(res);
}
+/* store a new connection information string */
+static void
+push_connection(const char *c)
+{
+ connections = pg_realloc(connections, sizeof(connection_t) * (n_connections+1));
+ connections[n_connections].connection = pg_strdup(c);
+ connections[n_connections].errors = 0;
+ n_connections++;
+}
+
+/* switch connection */
+static int
+next_connection(int *pci)
+{
+ int ci;
+
+ ci = ((*pci) + 1) % n_connections;
+ *pci = ci;
+
+ return ci;
+}
+
+/* return the connection index to use for next attempt */
+static int
+choose_connection(int *pci)
+{
+ int ci;
+
+ switch (mc_policy)
+ {
+ case MC_FIRST:
+ ci = 0;
+ break;
+ case MC_RANDOM:
+ // FIXME should use a prng state ; not thread safe ;
+ ci = (int) getrand(&base_random_sequence, 0, n_connections-1);
+ *pci = ci;
+ break;
+ case MC_ROUND_ROBIN:
+ ci = next_connection(pci);
+ break;
+ case MC_WORKING:
+ ci = *pci;
+ break;
+ default:
+ pg_fatal("unexpected multi connection policy: %d", mc_policy);
+ exit(1);
+ }
+
+ return ci;
+}
+
+/* return multi-connection policy based on its name or shortest prefix */
+static mc_policy_t
+get_connection_policy(const char *s)
+{
+ if (s == NULL || *s == '\0' || strcmp(s, "first") == 0 || strcmp(s, "f") == 0)
+ return MC_FIRST;
+ else if (strcmp(s, "random") == 0 || strcmp(s, "ra") == 0)
+ return MC_RANDOM;
+ else if (strcmp(s, "round-robin") == 0 || strcmp(s, "ro") == 0)
+ return MC_ROUND_ROBIN;
+ else if (strcmp(s, "working") == 0 || strcmp(s, "w") == 0)
+ return MC_WORKING;
+ else
+ return MC_UNKNOWN;
+}
+
+/* get backend connection info */
+static connection_t *
+getConnection(void)
+{
+ return &connections[choose_connection(¤t_connection)];
+}
+
/* set up a connection to the backend */
static PGconn *
doConnect(void)
{
- PGconn *conn;
- bool new_pass;
- static char *password = NULL;
+ PGconn *conn;
+ bool new_pass;
+ static char *password = NULL;
+ connection_t *ci = getConnection();
/*
* Start the connection. Loop until we have a password if requested by
@@ -1562,8 +1665,9 @@ doConnect(void)
values[2] = username;
keywords[3] = "password";
values[3] = password;
+ /* dbname can include a full conninfo */
keywords[4] = "dbname";
- values[4] = dbName;
+ values[4] = ci->connection;
keywords[5] = "fallback_application_name";
values[5] = progname;
keywords[6] = NULL;
@@ -1571,11 +1675,12 @@ doConnect(void)
new_pass = false;
+ pg_log_debug("connecting to %s", ci->connection);
conn = PQconnectdbParams(keywords, values, true);
if (!conn)
{
- pg_log_error("connection to database \"%s\" failed", dbName);
+ pg_log_error("connection to database \"%s\" failed", ci->connection);
return NULL;
}
@@ -1594,6 +1699,9 @@ doConnect(void)
{
pg_log_error("%s", PQerrorMessage(conn));
PQfinish(conn);
+ ci->errors += 1;
+ if (mc_policy == MC_WORKING)
+ (void) next_connection(¤t_connection);
return NULL;
}
@@ -6544,6 +6652,7 @@ main(int argc, char **argv)
{"failures-detailed", no_argument, NULL, 13},
{"max-tries", required_argument, NULL, 14},
{"verbose-errors", no_argument, NULL, 15},
+ {"connection-policy", required_argument, NULL, 16},
{NULL, 0, NULL, 0}
};
@@ -6881,6 +6990,14 @@ main(int argc, char **argv)
benchmarking_option_set = true;
verbose_errors = true;
break;
+ case 16:
+ mc_policy = get_connection_policy(optarg);
+ if (mc_policy == MC_UNKNOWN)
+ {
+ pg_fatal("unexpected connection policy: %s", optarg);
+ exit(1);
+ }
+ break;
default:
/* getopt_long already emitted a complaint */
pg_log_error_hint("Try \"%s --help\" for more information.", progname);
@@ -6932,23 +7049,18 @@ main(int argc, char **argv)
throttle_delay *= nthreads;
if (argc > optind)
- dbName = argv[optind++];
+ {
+ while (optind < argc)
+ push_connection(argv[optind++]);
+ }
else
{
if ((env = getenv("PGDATABASE")) != NULL && *env != '\0')
- dbName = env;
+ push_connection(env);
else if ((env = getenv("PGUSER")) != NULL && *env != '\0')
- dbName = env;
+ push_connection(env);
else
- dbName = get_user_name_or_exit(progname);
- }
-
- if (optind < argc)
- {
- pg_log_error("too many command-line arguments (first is \"%s\")",
- argv[optind]);
- pg_log_error_hint("Try \"%s --help\" for more information.", progname);
- exit(1);
+ push_connection(get_user_name_or_exit(progname));
}
if (is_init_mode)
This patch seems to have quite some use case overlap with my patch which
adds load balancing to libpq itself:
/messages/by-id/PR3PR83MB04768E2FF04818EEB2179949F7A69@PR3PR83MB0476.EURPRD83.prod.outlook.com
My patch is only able to add "random" load balancing though, not
"round-robin". So this patch still definitely seems useful, even when mine
gets merged.
I'm not sure that the support for the "working" connection is necessary
from a feature perspective though (usability/discoverability is another
question). It's already possible to achieve the same behaviour by simply
providing multiple host names in the connection string. You can even tell
libpq to connect to a primary or secondary by using the
target_session_attrs option.
On Fri, 6 Jan 2023 at 11:33, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Show quoted text
Hello Ian,
cfbot reports the patch no longer applies. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch.Attached a v5 which is just a rebase.
--
Fabien.
Hello Jelte,
This patch seems to have quite some use case overlap with my patch which
adds load balancing to libpq itself:
/messages/by-id/PR3PR83MB04768E2FF04818EEB2179949F7A69@PR3PR83MB0476.EURPRD83.prod.outlook.com
Thanks for the pointer.
The end purpose of the patch is to allow pgbench to follow a failover at
some point, at the client level, AFAICR.
My patch is only able to add "random" load balancing though, not
"round-robin". So this patch still definitely seems useful, even when mine
gets merged.
Yep. I'm not sure the end purpose is the same, but possibly the pgbench
patch could take advantage of libpq extension.
I'm not sure that the support for the "working" connection is necessary
from a feature perspective though (usability/discoverability is another
question). It's already possible to achieve the same behaviour by simply
providing multiple host names in the connection string. You can even tell
libpq to connect to a primary or secondary by using the
target_session_attrs option.
--
Fabien.
On Tue, 8 Nov 2022 at 02:16, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Hello Ian,
cfbot reports the patch no longer applies. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch.Attached a v5 which is just a rebase.
The patch does not apply on top of HEAD as in [1]http://cfbot.cputube.org/patch_41_3227.log, please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
3c6fc58209f24b959ee18f5d19ef96403d08f15c ===
=== applying patch ./pgbench-multi-connect-conninfo-5.patch
(Stripping trailing CRs from patch; use --binary to disable.)
patching file doc/src/sgml/ref/pgbench.sgml
Hunk #3 FAILED at 921.
1 out of 3 hunks FAILED -- saving rejects to file
doc/src/sgml/ref/pgbench.sgml.rej
[1]: http://cfbot.cputube.org/patch_41_3227.log
Regards,
Vignesh
On Wed, 11 Jan 2023 at 22:17, vignesh C <vignesh21@gmail.com> wrote:
On Tue, 8 Nov 2022 at 02:16, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Hello Ian,
cfbot reports the patch no longer applies. As CommitFest 2022-11 is
currently underway, this would be an excellent time to update the patch.Attached a v5 which is just a rebase.
The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
3c6fc58209f24b959ee18f5d19ef96403d08f15c ===
=== applying patch ./pgbench-multi-connect-conninfo-5.patch
(Stripping trailing CRs from patch; use --binary to disable.)
patching file doc/src/sgml/ref/pgbench.sgml
Hunk #3 FAILED at 921.
1 out of 3 hunks FAILED -- saving rejects to file
doc/src/sgml/ref/pgbench.sgml.rej
There has been no updates on this thread for some time, so this has
been switched as Returned with Feedback. Feel free to change it open
in the next commitfest if you plan to continue on this.
Regards,
Vignesh