pgbench: using prepared BEGIN statement in a pipeline could cause an error

Started by Yugo NAGATAover 4 years ago26 messages
#1Yugo NAGATA
nagata@sraoss.co.jp
1 attachment(s)

Hello,

I found that using "BEGIN ISOLATINO LEVEL SERIALIZABLE" in a pipline with
prepared statement makes pgbench abort.

$ cat pipeline.sql
\startpipeline
begin isolation level repeatable read;
select 1;
end;
\endpipeline

$ pgbench -f pipeline.sql -M prepared -t 1
pgbench (15devel)
starting vacuum...end.
pgbench: error: client 0 script 0 aborted in command 4 query 0:
transaction type: pipeline.sql
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
number of transactions per client: 1
number of transactions actually processed: 0/1
pgbench: fatal: Run was aborted; the above results are incomplete.

The error that occured in the backend was
"ERROR: SET TRANSACTION ISOLATION LEVEL must be called before any query".

After investigating this, now I've got the cause as below.

1. The commands in the script are executed in the order. First, pipeline
mode starts at \startpipeline.
2. Parse messages for all SQL commands in the script are sent to the backend
because it is first time to execute them.
3. An implicit transaction starts, and this is not committed yet because Sync
message is not sent at that time in pipeline mode.
4. All prepared statements are sent to the backend.
5. After processing \endpipeline, Sync is issued and all sent commands are
executed.
6. However, the BEGIN doesn't start new transaction because the implicit
transaction has already started. The error above occurs because the snapshot
was already created before the BEGIN command.

We can also see the similar error when using "BEGIN DEFERRABLE".

One way to avoid these errors is to send Parse messages before pipeline mode
starts. I attached a patch to fix to prepare commands at starting of a script
instead of at the first execution of the command.

Or, we can also avoid these errors by placing \startpipeline after the BEGIN,
so it might be enogh just to note in the documentation.

Actually, we also get an error just when there is another SQL command before the
BEGIN in a pipelne, as below, regardless to using prepared statement or not,
because this command cause an implicit transaction.

\startpipeline
select 0;
begin isolation level repeatable read;
select 1;
end;
\endpipeline

I think it is hard to prevent this error from pgbench without analysing command
strings. Therefore, noting in the documentation that the first command in a pipeline
starts an implicit transaction might be useful for users.

What do you think?

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata@sraoss.co.jp>

Attachments:

pgbench-prepare.patchtext/x-diff; name=pgbench-prepare.patchDownload
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 364b5a2e47..56e790fa33 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2858,6 +2858,30 @@ chooseScript(TState *thread)
 	return i - 1;
 }
 
+/* Prepare SQL commands in the chosen script */
+static void
+prepareCommands(CState *st)
+{
+	int			j;
+	Command   **commands = sql_script[st->use_file].commands;
+
+	for (j = 0; commands[j] != NULL; j++)
+	{
+		PGresult   *res;
+		char		name[MAX_PREPARE_NAME];
+
+		if (commands[j]->type != SQL_COMMAND)
+			continue;
+		preparedStatementName(name, st->use_file, j);
+		res = PQprepare(st->con, name,
+						commands[j]->argv[0], commands[j]->argc - 1, NULL);
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			pg_log_error("%s", PQerrorMessage(st->con));
+		PQclear(res);
+	}
+	st->prepared[st->use_file] = true;
+}
+
 /* Send a SQL command, using the chosen querymode */
 static bool
 sendCommand(CState *st, Command *command)
@@ -2891,42 +2915,6 @@ sendCommand(CState *st, Command *command)
 		char		name[MAX_PREPARE_NAME];
 		const char *params[MAX_ARGS];
 
-		if (!st->prepared[st->use_file])
-		{
-			int			j;
-			Command   **commands = sql_script[st->use_file].commands;
-
-			for (j = 0; commands[j] != NULL; j++)
-			{
-				PGresult   *res;
-				char		name[MAX_PREPARE_NAME];
-
-				if (commands[j]->type != SQL_COMMAND)
-					continue;
-				preparedStatementName(name, st->use_file, j);
-				if (PQpipelineStatus(st->con) == PQ_PIPELINE_OFF)
-				{
-					res = PQprepare(st->con, name,
-									commands[j]->argv[0], commands[j]->argc - 1, NULL);
-					if (PQresultStatus(res) != PGRES_COMMAND_OK)
-						pg_log_error("%s", PQerrorMessage(st->con));
-					PQclear(res);
-				}
-				else
-				{
-					/*
-					 * In pipeline mode, we use asynchronous functions. If a
-					 * server-side error occurs, it will be processed later
-					 * among the other results.
-					 */
-					if (!PQsendPrepare(st->con, name,
-									   commands[j]->argv[0], commands[j]->argc - 1, NULL))
-						pg_log_error("%s", PQerrorMessage(st->con));
-				}
-			}
-			st->prepared[st->use_file] = true;
-		}
-
 		getQueryParams(st, command, params);
 		preparedStatementName(name, st->use_file, st->command);
 
@@ -3194,6 +3182,11 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 					memset(st->prepared, 0, sizeof(st->prepared));
 				}
 
+
+				/* Prepare SQL commands if not yet */
+				if (querymode == QUERY_PREPARED && !st->prepared[st->use_file])
+					prepareCommands(st);
+
 				/* record transaction start time */
 				st->txn_begin = now;
 
#2Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Yugo NAGATA (#1)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

Hello Yugo-san,

[...] One way to avoid these errors is to send Parse messages before
pipeline mode starts. I attached a patch to fix to prepare commands at
starting of a script instead of at the first execution of the command.

What do you think?

ISTM that moving prepare out of command execution is a good idea, so I'm
in favor of this approach: the code is simpler and cleaner.

ISTM that a minor impact is that the preparation is not counted in the
command performance statistics. I do not think that it is a problem, even
if it would change detailed results under -C -r -M prepared.

Patch applies & compiles cleanly, global & local make check ok. However
the issue is not tested. I think that the patch should add a tap test case
for the problem being addressed.

I'd suggest to move the statement preparation call in the
CSTATE_CHOOSE_SCRIPT case.

In comments: not yet -> needed.

--
Fabien.

#3Yugo NAGATA
nagata@sraoss.co.jp
In reply to: Fabien COELHO (#2)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

Hello Fabien,

On Sat, 17 Jul 2021 07:03:01 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Hello Yugo-san,

[...] One way to avoid these errors is to send Parse messages before
pipeline mode starts. I attached a patch to fix to prepare commands at
starting of a script instead of at the first execution of the command.

What do you think?

ISTM that moving prepare out of command execution is a good idea, so I'm
in favor of this approach: the code is simpler and cleaner.

ISTM that a minor impact is that the preparation is not counted in the
command performance statistics. I do not think that it is a problem, even
if it would change detailed results under -C -r -M prepared.

I agree with you. Currently, whether prepares are sent in pipeline mode or
not depends on whether the first SQL command is placed between \startpipeline
and \endpipeline regardless whether other commands are executed in pipeline
or not. ISTM, this behavior would be not intuitive for users. Therefore,
I think preparing all statements not using pipeline mode is not problem for now.

If some users would like to send prepares in pipeline, I think it would be
better to provide more simple and direct way. For example, we prepare statements
in pipeline if the user use an option, or if the script has at least one
\startpipeline in their pipeline. Maybe, --pipeline option is useful for users
who want to use pipeline mode for all queries in scirpts including built-in ones.
However, these features seems to be out of the patch proposed in this thread.

Patch applies & compiles cleanly, global & local make check ok. However
the issue is not tested. I think that the patch should add a tap test case
for the problem being addressed.

Ok. I'll add a tap test to confirm the error I found is avoidable.

I'd suggest to move the statement preparation call in the
CSTATE_CHOOSE_SCRIPT case.

I thought so at first, but I noticed we cannot do it at least if we are
using -C because the connection may not be established in the
CSTATE_CHOOSE_SCRIPT state.

In comments: not yet -> needed.

Thanks. I'll fix it.

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata@sraoss.co.jp>

#4Yugo NAGATA
nagata@sraoss.co.jp
In reply to: Yugo NAGATA (#3)
1 attachment(s)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On Mon, 19 Jul 2021 10:51:36 +0900
Yugo NAGATA <nagata@sraoss.co.jp> wrote:

Hello Fabien,

On Sat, 17 Jul 2021 07:03:01 +0200 (CEST)
Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Hello Yugo-san,

[...] One way to avoid these errors is to send Parse messages before
pipeline mode starts. I attached a patch to fix to prepare commands at
starting of a script instead of at the first execution of the command.

What do you think?

ISTM that moving prepare out of command execution is a good idea, so I'm
in favor of this approach: the code is simpler and cleaner.

ISTM that a minor impact is that the preparation is not counted in the
command performance statistics. I do not think that it is a problem, even
if it would change detailed results under -C -r -M prepared.

I agree with you. Currently, whether prepares are sent in pipeline mode or
not depends on whether the first SQL command is placed between \startpipeline
and \endpipeline regardless whether other commands are executed in pipeline
or not. ISTM, this behavior would be not intuitive for users. Therefore,
I think preparing all statements not using pipeline mode is not problem for now.

If some users would like to send prepares in pipeline, I think it would be
better to provide more simple and direct way. For example, we prepare statements
in pipeline if the user use an option, or if the script has at least one
\startpipeline in their pipeline. Maybe, --pipeline option is useful for users
who want to use pipeline mode for all queries in scirpts including built-in ones.
However, these features seems to be out of the patch proposed in this thread.

Patch applies & compiles cleanly, global & local make check ok. However
the issue is not tested. I think that the patch should add a tap test case
for the problem being addressed.

Ok. I'll add a tap test to confirm the error I found is avoidable.

I'd suggest to move the statement preparation call in the
CSTATE_CHOOSE_SCRIPT case.

I thought so at first, but I noticed we cannot do it at least if we are
using -C because the connection may not be established in the
CSTATE_CHOOSE_SCRIPT state.

In comments: not yet -> needed.

Thanks. I'll fix it.

I attached the updated patch v2, which includes a comment fix and a TAP test.

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata@sraoss.co.jp>

Attachments:

v2-pgbench-prepare.patchtext/x-diff; name=v2-pgbench-prepare.patchDownload
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 364b5a2e47..4e6db32fc9 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2858,6 +2858,30 @@ chooseScript(TState *thread)
 	return i - 1;
 }
 
+/* Prepare SQL commands in the chosen script */
+static void
+prepareCommands(CState *st)
+{
+	int			j;
+	Command   **commands = sql_script[st->use_file].commands;
+
+	for (j = 0; commands[j] != NULL; j++)
+	{
+		PGresult   *res;
+		char		name[MAX_PREPARE_NAME];
+
+		if (commands[j]->type != SQL_COMMAND)
+			continue;
+		preparedStatementName(name, st->use_file, j);
+		res = PQprepare(st->con, name,
+						commands[j]->argv[0], commands[j]->argc - 1, NULL);
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			pg_log_error("%s", PQerrorMessage(st->con));
+		PQclear(res);
+	}
+	st->prepared[st->use_file] = true;
+}
+
 /* Send a SQL command, using the chosen querymode */
 static bool
 sendCommand(CState *st, Command *command)
@@ -2891,42 +2915,6 @@ sendCommand(CState *st, Command *command)
 		char		name[MAX_PREPARE_NAME];
 		const char *params[MAX_ARGS];
 
-		if (!st->prepared[st->use_file])
-		{
-			int			j;
-			Command   **commands = sql_script[st->use_file].commands;
-
-			for (j = 0; commands[j] != NULL; j++)
-			{
-				PGresult   *res;
-				char		name[MAX_PREPARE_NAME];
-
-				if (commands[j]->type != SQL_COMMAND)
-					continue;
-				preparedStatementName(name, st->use_file, j);
-				if (PQpipelineStatus(st->con) == PQ_PIPELINE_OFF)
-				{
-					res = PQprepare(st->con, name,
-									commands[j]->argv[0], commands[j]->argc - 1, NULL);
-					if (PQresultStatus(res) != PGRES_COMMAND_OK)
-						pg_log_error("%s", PQerrorMessage(st->con));
-					PQclear(res);
-				}
-				else
-				{
-					/*
-					 * In pipeline mode, we use asynchronous functions. If a
-					 * server-side error occurs, it will be processed later
-					 * among the other results.
-					 */
-					if (!PQsendPrepare(st->con, name,
-									   commands[j]->argv[0], commands[j]->argc - 1, NULL))
-						pg_log_error("%s", PQerrorMessage(st->con));
-				}
-			}
-			st->prepared[st->use_file] = true;
-		}
-
 		getQueryParams(st, command, params);
 		preparedStatementName(name, st->use_file, st->command);
 
@@ -3194,6 +3182,11 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 					memset(st->prepared, 0, sizeof(st->prepared));
 				}
 
+
+				/* Prepare SQL commands if needed */
+				if (querymode == QUERY_PREPARED && !st->prepared[st->use_file])
+					prepareCommands(st);
+
 				/* record transaction start time */
 				st->txn_begin = now;
 
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 3aa9d5d753..48d1e29f16 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -878,6 +878,23 @@ select 1 \gset f
 }
 	});
 
+# Working \startpipeline in prepared query mode with serializable
+pgbench(
+	'-t 1 -n -M prepared',
+	0,
+	[ qr{type: .*/001_pgbench_pipeline_serializable}, qr{actually processed: 1/1} ],
+	[],
+	'working \startpipeline with serializable',
+	{
+		'001_pgbench_pipeline_serializable' => q{
+-- test startpipeline with serializable
+\startpipeline
+BEGIN ISOLATION LEVEL SERIALIZABLE;
+} . "select 1;\n" x 10 . q{
+END;
+\endpipeline
+}
+	});
 
 # trigger many expression errors
 my @errors = (
#5Daniel Gustafsson
daniel@yesql.se
In reply to: Yugo NAGATA (#4)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On 21 Jul 2021, at 03:49, Yugo NAGATA <nagata@sraoss.co.jp> wrote:

I attached the updated patch v2, which includes a comment fix and a TAP test.

This patch fails the TAP test for pgbench:

# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 25 just after 224.
t/001_pgbench_with_server.pl ..
Dubious, test returned 25 (wstat 6400, 0x1900)
All 224 subtests passed
t/002_pgbench_no_server.pl .... ok
Test Summary Report
-------------------
t/001_pgbench_with_server.pl (Wstat: 6400 Tests: 224 Failed: 0)
Non-zero exit status: 25
Parse errors: No plan found in TAP output
Files=2, Tests=426, 3 wallclock secs ( 0.04 usr 0.00 sys + 1.20 cusr 0.36 csys = 1.60 CPU)
Result: FAIL

--
Daniel Gustafsson https://vmware.com/

#6Yugo NAGATA
nagata@sraoss.co.jp
In reply to: Daniel Gustafsson (#5)
1 attachment(s)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

Hello Daniel Gustafsson,

On Mon, 15 Nov 2021 14:13:32 +0100
Daniel Gustafsson <daniel@yesql.se> wrote:

On 21 Jul 2021, at 03:49, Yugo NAGATA <nagata@sraoss.co.jp> wrote:

I attached the updated patch v2, which includes a comment fix and a TAP test.

This patch fails the TAP test for pgbench:

Thank you for pointing it out!
I attached the updated patch.

Regards,
Yugo Nagata

# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 25 just after 224.
t/001_pgbench_with_server.pl ..
Dubious, test returned 25 (wstat 6400, 0x1900)
All 224 subtests passed
t/002_pgbench_no_server.pl .... ok
Test Summary Report
-------------------
t/001_pgbench_with_server.pl (Wstat: 6400 Tests: 224 Failed: 0)
Non-zero exit status: 25
Parse errors: No plan found in TAP output
Files=2, Tests=426, 3 wallclock secs ( 0.04 usr 0.00 sys + 1.20 cusr 0.36 csys = 1.60 CPU)
Result: FAIL

--
Daniel Gustafsson https://vmware.com/

--
Yugo NAGATA <nagata@sraoss.co.jp>

Attachments:

v3-pgbench-prepare.patchtext/x-diff; name=v3-pgbench-prepare.patchDownload
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index c12b6f0615..c7c3963600 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2859,6 +2859,30 @@ chooseScript(TState *thread)
 	return i - 1;
 }
 
+/* Prepare SQL commands in the chosen script */
+static void
+prepareCommands(CState *st)
+{
+	int			j;
+	Command   **commands = sql_script[st->use_file].commands;
+
+	for (j = 0; commands[j] != NULL; j++)
+	{
+		PGresult   *res;
+		char		name[MAX_PREPARE_NAME];
+
+		if (commands[j]->type != SQL_COMMAND)
+			continue;
+		preparedStatementName(name, st->use_file, j);
+		res = PQprepare(st->con, name,
+						commands[j]->argv[0], commands[j]->argc - 1, NULL);
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			pg_log_error("%s", PQerrorMessage(st->con));
+		PQclear(res);
+	}
+	st->prepared[st->use_file] = true;
+}
+
 /* Send a SQL command, using the chosen querymode */
 static bool
 sendCommand(CState *st, Command *command)
@@ -2892,42 +2916,6 @@ sendCommand(CState *st, Command *command)
 		char		name[MAX_PREPARE_NAME];
 		const char *params[MAX_ARGS];
 
-		if (!st->prepared[st->use_file])
-		{
-			int			j;
-			Command   **commands = sql_script[st->use_file].commands;
-
-			for (j = 0; commands[j] != NULL; j++)
-			{
-				PGresult   *res;
-				char		name[MAX_PREPARE_NAME];
-
-				if (commands[j]->type != SQL_COMMAND)
-					continue;
-				preparedStatementName(name, st->use_file, j);
-				if (PQpipelineStatus(st->con) == PQ_PIPELINE_OFF)
-				{
-					res = PQprepare(st->con, name,
-									commands[j]->argv[0], commands[j]->argc - 1, NULL);
-					if (PQresultStatus(res) != PGRES_COMMAND_OK)
-						pg_log_error("%s", PQerrorMessage(st->con));
-					PQclear(res);
-				}
-				else
-				{
-					/*
-					 * In pipeline mode, we use asynchronous functions. If a
-					 * server-side error occurs, it will be processed later
-					 * among the other results.
-					 */
-					if (!PQsendPrepare(st->con, name,
-									   commands[j]->argv[0], commands[j]->argc - 1, NULL))
-						pg_log_error("%s", PQerrorMessage(st->con));
-				}
-			}
-			st->prepared[st->use_file] = true;
-		}
-
 		getQueryParams(st, command, params);
 		preparedStatementName(name, st->use_file, st->command);
 
@@ -3199,6 +3187,11 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 					memset(st->prepared, 0, sizeof(st->prepared));
 				}
 
+
+				/* Prepare SQL commands if needed */
+				if (querymode == QUERY_PREPARED && !st->prepared[st->use_file])
+					prepareCommands(st);
+
 				/* record transaction start time */
 				st->txn_begin = now;
 
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 69ffa595dd..7b7a8518d1 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -830,6 +830,23 @@ select 1 \gset f
 }
 	});
 
+# Working \startpipeline in prepared query mode with serializable
+$node->pgbench(
+	'-t 1 -n -M prepared',
+	0,
+	[ qr{type: .*/001_pgbench_pipeline_serializable}, qr{actually processed: 1/1} ],
+	[],
+	'working \startpipeline with serializable',
+	{
+		'001_pgbench_pipeline_serializable' => q{
+-- test startpipeline with serializable
+\startpipeline
+BEGIN ISOLATION LEVEL SERIALIZABLE;
+} . "select 1;\n" x 10 . q{
+END;
+\endpipeline
+}
+	});
 
 # trigger many expression errors
 my @errors = (
#7Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Yugo NAGATA (#4)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

At Tue, 16 Nov 2021 02:26:43 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in

Thank you for pointing it out!
I attached the updated patch.

I think we want more elabolative comment for the new place of
preparing as you mentioned in the first mail.

At Fri, 16 Jul 2021 15:30:13 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in

One way to avoid these errors is to send Parse messages before
pipeline mode starts.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#8Yugo NAGATA
nagata@sraoss.co.jp
In reply to: Kyotaro Horiguchi (#7)
1 attachment(s)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

Hi Horiguchi-san,

On Thu, 27 Jan 2022 17:50:25 +0900 (JST)
Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

At Tue, 16 Nov 2021 02:26:43 +0900, Yugo NAGATA <nagata@sraoss.co.jp> wrote in

Thank you for pointing it out!
I attached the updated patch.

I think we want more elabolative comment for the new place of
preparing as you mentioned in the first mail.

Thank you for your suggestion.

I added comments on the prepareCommands() call as in the updated patch.

Regards,
Yugo Nagata

Yugo NAGATA <nagata@sraoss.co.jp>

Attachments:

v4-pgbench-prepare.patchtext/x-diff; name=v4-pgbench-prepare.patchDownload
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index f166a77e3a..c5893eabe4 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2832,6 +2832,30 @@ chooseScript(TState *thread)
 	return i - 1;
 }
 
+/* Prepare SQL commands in the chosen script */
+static void
+prepareCommands(CState *st)
+{
+	int			j;
+	Command   **commands = sql_script[st->use_file].commands;
+
+	for (j = 0; commands[j] != NULL; j++)
+	{
+		PGresult   *res;
+		char		name[MAX_PREPARE_NAME];
+
+		if (commands[j]->type != SQL_COMMAND)
+			continue;
+		preparedStatementName(name, st->use_file, j);
+		res = PQprepare(st->con, name,
+						commands[j]->argv[0], commands[j]->argc - 1, NULL);
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			pg_log_error("%s", PQerrorMessage(st->con));
+		PQclear(res);
+	}
+	st->prepared[st->use_file] = true;
+}
+
 /* Send a SQL command, using the chosen querymode */
 static bool
 sendCommand(CState *st, Command *command)
@@ -2865,42 +2889,6 @@ sendCommand(CState *st, Command *command)
 		char		name[MAX_PREPARE_NAME];
 		const char *params[MAX_ARGS];
 
-		if (!st->prepared[st->use_file])
-		{
-			int			j;
-			Command   **commands = sql_script[st->use_file].commands;
-
-			for (j = 0; commands[j] != NULL; j++)
-			{
-				PGresult   *res;
-				char		name[MAX_PREPARE_NAME];
-
-				if (commands[j]->type != SQL_COMMAND)
-					continue;
-				preparedStatementName(name, st->use_file, j);
-				if (PQpipelineStatus(st->con) == PQ_PIPELINE_OFF)
-				{
-					res = PQprepare(st->con, name,
-									commands[j]->argv[0], commands[j]->argc - 1, NULL);
-					if (PQresultStatus(res) != PGRES_COMMAND_OK)
-						pg_log_error("%s", PQerrorMessage(st->con));
-					PQclear(res);
-				}
-				else
-				{
-					/*
-					 * In pipeline mode, we use asynchronous functions. If a
-					 * server-side error occurs, it will be processed later
-					 * among the other results.
-					 */
-					if (!PQsendPrepare(st->con, name,
-									   commands[j]->argv[0], commands[j]->argc - 1, NULL))
-						pg_log_error("%s", PQerrorMessage(st->con));
-				}
-			}
-			st->prepared[st->use_file] = true;
-		}
-
 		getQueryParams(st, command, params);
 		preparedStatementName(name, st->use_file, st->command);
 
@@ -3172,6 +3160,20 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 					memset(st->prepared, 0, sizeof(st->prepared));
 				}
 
+				/*
+				 * Prepare SQL commands if needed.
+				 *
+				 * We should send Parse messages before executing meta commands
+				 * especially /startpipeline. If a Parse message is sent in
+				 * pipeline mode, a transaction starts before BEGIN is sent, and
+				 * it could be a problem. For example, "BEGIN ISOLATION LEVEL
+				 * SERIALIZABLE" is sent after a transaction starts, the error
+				 * "ERROR:  SET TRANSACTION ISOLATION LEVEL must be called before any query"
+				 * occurs.
+				 */
+				if (querymode == QUERY_PREPARED && !st->prepared[st->use_file])
+					prepareCommands(st);
+
 				/* record transaction start time */
 				st->txn_begin = now;
 
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index f1341092fe..7e4ec728e8 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -837,6 +837,23 @@ select 1 \gset f
 }
 	});
 
+# Working \startpipeline in prepared query mode with serializable
+$node->pgbench(
+	'-t 1 -n -M prepared',
+	0,
+	[ qr{type: .*/001_pgbench_pipeline_serializable}, qr{actually processed: 1/1} ],
+	[],
+	'working \startpipeline with serializable',
+	{
+		'001_pgbench_pipeline_serializable' => q{
+-- test startpipeline with serializable
+\startpipeline
+BEGIN ISOLATION LEVEL SERIALIZABLE;
+} . "select 1;\n" x 10 . q{
+END;
+\endpipeline
+}
+	});
 
 # trigger many expression errors
 my @errors = (
#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fabien COELHO (#2)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

Fabien COELHO <coelho@cri.ensmp.fr> writes:

[...] One way to avoid these errors is to send Parse messages before
pipeline mode starts. I attached a patch to fix to prepare commands at
starting of a script instead of at the first execution of the command.

ISTM that moving prepare out of command execution is a good idea, so I'm
in favor of this approach: the code is simpler and cleaner.
ISTM that a minor impact is that the preparation is not counted in the
command performance statistics. I do not think that it is a problem, even
if it would change detailed results under -C -r -M prepared.

I am not convinced this is a great idea. The current behavior is that
a statement is not prepared until it's about to be executed, and I think
we chose that deliberately to avoid semantic differences between prepared
and not-prepared mode. For example, if a script looks like

CREATE FUNCTION foo(...) ...;
SELECT foo(...);
DROP FUNCTION foo;

trying to prepare the SELECT in advance would lead to failure.

We could perhaps get away with preparing the commands within a pipeline
just before we start to execute the pipeline, but it looks to me like
this patch tries to prepare the entire script in advance.

BTW, the cfbot says the patch is failing to apply anyway ...
I think it was sideswiped by 4a39f87ac.

regards, tom lane

#10Yugo NAGATA
nagata@sraoss.co.jp
In reply to: Tom Lane (#9)
1 attachment(s)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On Fri, 25 Mar 2022 16:19:54 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

Fabien COELHO <coelho@cri.ensmp.fr> writes:

[...] One way to avoid these errors is to send Parse messages before
pipeline mode starts. I attached a patch to fix to prepare commands at
starting of a script instead of at the first execution of the command.

ISTM that moving prepare out of command execution is a good idea, so I'm
in favor of this approach: the code is simpler and cleaner.
ISTM that a minor impact is that the preparation is not counted in the
command performance statistics. I do not think that it is a problem, even
if it would change detailed results under -C -r -M prepared.

I am not convinced this is a great idea. The current behavior is that
a statement is not prepared until it's about to be executed, and I think
we chose that deliberately to avoid semantic differences between prepared
and not-prepared mode. For example, if a script looks like

CREATE FUNCTION foo(...) ...;
SELECT foo(...);
DROP FUNCTION foo;

trying to prepare the SELECT in advance would lead to failure.

We could perhaps get away with preparing the commands within a pipeline
just before we start to execute the pipeline, but it looks to me like
this patch tries to prepare the entire script in advance.

Well, the semantic differences is already in the current behavior.
Currently, pgbench fails to execute the above script in prepared mode
because it prepares the entire script in advance just before the first
command execution. This patch does not change the semantic.

BTW, the cfbot says the patch is failing to apply anyway ...
I think it was sideswiped by 4a39f87ac.

I attached the rebased patch.

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata@sraoss.co.jp>

Attachments:

v5-pgbench-prepare.patchtext/x-diff; name=v5-pgbench-prepare.patchDownload
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index acf3e56413..bf8fecf219 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3074,6 +3074,30 @@ chooseScript(TState *thread)
 	return i - 1;
 }
 
+/* Prepare SQL commands in the chosen script */
+static void
+prepareCommands(CState *st)
+{
+	int			j;
+	Command   **commands = sql_script[st->use_file].commands;
+
+	for (j = 0; commands[j] != NULL; j++)
+	{
+		PGresult   *res;
+		char		name[MAX_PREPARE_NAME];
+
+		if (commands[j]->type != SQL_COMMAND)
+			continue;
+		preparedStatementName(name, st->use_file, j);
+		res = PQprepare(st->con, name,
+						commands[j]->argv[0], commands[j]->argc - 1, NULL);
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			pg_log_error("%s", PQerrorMessage(st->con));
+		PQclear(res);
+	}
+	st->prepared[st->use_file] = true;
+}
+
 /* Send a SQL command, using the chosen querymode */
 static bool
 sendCommand(CState *st, Command *command)
@@ -3107,42 +3131,6 @@ sendCommand(CState *st, Command *command)
 		char		name[MAX_PREPARE_NAME];
 		const char *params[MAX_ARGS];
 
-		if (!st->prepared[st->use_file])
-		{
-			int			j;
-			Command   **commands = sql_script[st->use_file].commands;
-
-			for (j = 0; commands[j] != NULL; j++)
-			{
-				PGresult   *res;
-				char		name[MAX_PREPARE_NAME];
-
-				if (commands[j]->type != SQL_COMMAND)
-					continue;
-				preparedStatementName(name, st->use_file, j);
-				if (PQpipelineStatus(st->con) == PQ_PIPELINE_OFF)
-				{
-					res = PQprepare(st->con, name,
-									commands[j]->argv[0], commands[j]->argc - 1, NULL);
-					if (PQresultStatus(res) != PGRES_COMMAND_OK)
-						pg_log_error("%s", PQerrorMessage(st->con));
-					PQclear(res);
-				}
-				else
-				{
-					/*
-					 * In pipeline mode, we use asynchronous functions. If a
-					 * server-side error occurs, it will be processed later
-					 * among the other results.
-					 */
-					if (!PQsendPrepare(st->con, name,
-									   commands[j]->argv[0], commands[j]->argc - 1, NULL))
-						pg_log_error("%s", PQerrorMessage(st->con));
-				}
-			}
-			st->prepared[st->use_file] = true;
-		}
-
 		getQueryParams(&st->variables, command, params);
 		preparedStatementName(name, st->use_file, st->command);
 
@@ -3619,6 +3607,20 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 					memset(st->prepared, 0, sizeof(st->prepared));
 				}
 
+				/*
+				 * Prepare SQL commands if needed.
+				 *
+				 * We should send Parse messages before executing meta commands
+				 * especially /startpipeline. If a Parse message is sent in
+				 * pipeline mode, a transaction starts before BEGIN is sent, and
+				 * it could be a problem. For example, "BEGIN ISOLATION LEVEL
+				 * SERIALIZABLE" is sent after a transaction starts, the error
+				 * "ERROR:  SET TRANSACTION ISOLATION LEVEL must be called before any query"
+				 * occurs.
+				 */
+				if (querymode == QUERY_PREPARED && !st->prepared[st->use_file])
+					prepareCommands(st);
+
 				/*
 				 * It is the first try to run this transaction. Remember the
 				 * random state: maybe it will get an error and we will need to
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index ca71f968dc..bcd8abe739 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -839,6 +839,23 @@ select 1 \gset f
 }
 	});
 
+# Working \startpipeline in prepared query mode with serializable
+$node->pgbench(
+	'-t 1 -n -M prepared',
+	0,
+	[ qr{type: .*/001_pgbench_pipeline_serializable}, qr{actually processed: 1/1} ],
+	[],
+	'working \startpipeline with serializable',
+	{
+		'001_pgbench_pipeline_serializable' => q{
+-- test startpipeline with serializable
+\startpipeline
+BEGIN ISOLATION LEVEL SERIALIZABLE;
+} . "select 1;\n" x 10 . q{
+END;
+\endpipeline
+}
+	});
 
 # trigger many expression errors
 my @errors = (
#11Ibrar Ahmed
ibrar.ahmad@gmail.com
In reply to: Yugo NAGATA (#10)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On Mon, Mar 28, 2022 at 8:35 AM Yugo NAGATA <nagata@sraoss.co.jp> wrote:

On Fri, 25 Mar 2022 16:19:54 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

Fabien COELHO <coelho@cri.ensmp.fr> writes:

[...] One way to avoid these errors is to send Parse messages before
pipeline mode starts. I attached a patch to fix to prepare commands

at

starting of a script instead of at the first execution of the command.

ISTM that moving prepare out of command execution is a good idea, so

I'm

in favor of this approach: the code is simpler and cleaner.
ISTM that a minor impact is that the preparation is not counted in the
command performance statistics. I do not think that it is a problem,

even

if it would change detailed results under -C -r -M prepared.

I am not convinced this is a great idea. The current behavior is that
a statement is not prepared until it's about to be executed, and I think
we chose that deliberately to avoid semantic differences between prepared
and not-prepared mode. For example, if a script looks like

CREATE FUNCTION foo(...) ...;
SELECT foo(...);
DROP FUNCTION foo;

trying to prepare the SELECT in advance would lead to failure.

We could perhaps get away with preparing the commands within a pipeline
just before we start to execute the pipeline, but it looks to me like
this patch tries to prepare the entire script in advance.

Well, the semantic differences is already in the current behavior.
Currently, pgbench fails to execute the above script in prepared mode
because it prepares the entire script in advance just before the first
command execution. This patch does not change the semantic.

BTW, the cfbot says the patch is failing to apply anyway ...
I think it was sideswiped by 4a39f87ac.

I attached the rebased patch.

Regards,
Yugo Nagata

--
Yugo NAGATA <nagata@sraoss.co.jp>

Hi Kyotaro Horiguchi, Fabien Coelho, Daniel Gustafsson,

Since you haven't had time to write a review the last many days, the author
replied
with a rebased patch for a long time and never heard. We've taken your name
off the reviewer list for this patch. Of course, you are still welcome to
review it if you can
find the time. We're removing your name so that other reviewers know the
patch still needs
attention. We understand that day jobs and other things get in the way of
doing patch
reviews when you want to, so please come back and review a patch or two
later when you
have more time.

--
Ibrar Ahmed

#12Julien Rouhaud
rjuju123@gmail.com
In reply to: Ibrar Ahmed (#11)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

Hi,

On Sat, Sep 03, 2022 at 10:36:37AM +0500, Ibrar Ahmed wrote:

Hi Kyotaro Horiguchi, Fabien Coelho, Daniel Gustafsson,

Since you haven't had time to write a review the last many days, the author
replied
with a rebased patch for a long time and never heard. We've taken your name
off the reviewer list for this patch. Of course, you are still welcome to
review it if you can
find the time. We're removing your name so that other reviewers know the
patch still needs
attention. We understand that day jobs and other things get in the way of
doing patch
reviews when you want to, so please come back and review a patch or two
later when you
have more time.

I thought that we decided not to remove assigned reviewers from a CF entry,
even if they didn't reply recently? See the discussion around
/messages/by-id/CA+TgmoZSBNhX0zCkG5T5KiQize9Aq4+ec+uqLcfBhm_+12MbQA@mail.gmail.com

#13Ibrar Ahmed
ibrar.ahmad@gmail.com
In reply to: Julien Rouhaud (#12)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On Sat, Sep 3, 2022 at 12:09 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

Hi,

On Sat, Sep 03, 2022 at 10:36:37AM +0500, Ibrar Ahmed wrote:

Hi Kyotaro Horiguchi, Fabien Coelho, Daniel Gustafsson,

Since you haven't had time to write a review the last many days, the

author

replied
with a rebased patch for a long time and never heard. We've taken your

name

off the reviewer list for this patch. Of course, you are still welcome to
review it if you can
find the time. We're removing your name so that other reviewers know the
patch still needs
attention. We understand that day jobs and other things get in the way of
doing patch
reviews when you want to, so please come back and review a patch or two
later when you
have more time.

I thought that we decided not to remove assigned reviewers from a CF entry,
even if they didn't reply recently? See the discussion around

/messages/by-id/CA+TgmoZSBNhX0zCkG5T5KiQize9Aq4+ec+uqLcfBhm_+12MbQA@mail.gmail.com

Ah, ok, thanks for the clarification. I will add them back.

@Jacob Champion, we need to update the CommitFest Checklist [1]https://wiki.postgresql.org/wiki/CommitFest_Checklist document
accordingly.

*"Reviewer Clear [reviewer name]:*

* Since you haven't had time to write a review of [patch] in the last 5
days, we've taken your name off the reviewer list for this patch."*

[1]: https://wiki.postgresql.org/wiki/CommitFest_Checklist

--
Ibrar Ahmed

#14Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Yugo NAGATA (#10)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On 2022-Mar-28, Yugo NAGATA wrote:

On Fri, 25 Mar 2022 16:19:54 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

I am not convinced this is a great idea. The current behavior is that
a statement is not prepared until it's about to be executed, and I think
we chose that deliberately to avoid semantic differences between prepared
and not-prepared mode. For example, if a script looks like

CREATE FUNCTION foo(...) ...;
SELECT foo(...);
DROP FUNCTION foo;

trying to prepare the SELECT in advance would lead to failure.

We could perhaps get away with preparing the commands within a pipeline
just before we start to execute the pipeline, but it looks to me like
this patch tries to prepare the entire script in advance.

Maybe it would work to have one extra boolean in struct Command, indicating
that the i-th command in the script is inside a pipeline; in -M
prepared, issue PREPARE for each command marked with that flag ahead of
time, and for all other commands, do as today. That way, we don't
change behavior for anything except those commands that need the change.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Digital and video cameras have this adjustment and film cameras don't for the
same reason dogs and cats lick themselves: because they can." (Ken Rockwell)

#15Yugo NAGATA
nagata@sraoss.co.jp
In reply to: Alvaro Herrera (#14)
1 attachment(s)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

Hi,

On Mon, 12 Sep 2022 17:03:43 +0200
Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2022-Mar-28, Yugo NAGATA wrote:

On Fri, 25 Mar 2022 16:19:54 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

I am not convinced this is a great idea. The current behavior is that
a statement is not prepared until it's about to be executed, and I think
we chose that deliberately to avoid semantic differences between prepared
and not-prepared mode. For example, if a script looks like

CREATE FUNCTION foo(...) ...;
SELECT foo(...);
DROP FUNCTION foo;

trying to prepare the SELECT in advance would lead to failure.

We could perhaps get away with preparing the commands within a pipeline
just before we start to execute the pipeline, but it looks to me like
this patch tries to prepare the entire script in advance.

Maybe it would work to have one extra boolean in struct Command, indicating
that the i-th command in the script is inside a pipeline; in -M
prepared, issue PREPARE for each command marked with that flag ahead of
time, and for all other commands, do as today. That way, we don't
change behavior for anything except those commands that need the change.

Well, I still don't understand why we need to prepare only "the
commands within a pipeline" before starting pipeline. In the current
behavior, the entire script is prepared in advance just before executing
the first SQL command in the script, instead of preparing each command
one by one. This patch also prepare the entire script in advance, so
there is no behavioural change in this sense.

However, there are a few behavioural changes. One is that the preparation
is not counted in the command performance statistics as Fabien mentioned.
Another is that all meta-commands including \shell and \sleep etc. are
executed before the preparation.

To reduce impact of these changes, I updated the patch to prepare the
commands just before executing the first SQL command or \startpipeline
meta-command instead of at the beginning of the script.

Regards,
Yugo Nagata

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Digital and video cameras have this adjustment and film cameras don't for the
same reason dogs and cats lick themselves: because they can." (Ken Rockwell)

--
Yugo NAGATA <nagata@sraoss.co.jp>

Attachments:

v6-pgbench-prepare.patchtext/x-diff; name=v6-pgbench-prepare.patchDownload
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index aa1a3541fe..12a268fe84 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3070,6 +3070,30 @@ chooseScript(TState *thread)
 	return i - 1;
 }
 
+/* Prepare SQL commands in the chosen script */
+static void
+prepareCommands(CState *st)
+{
+	int			j;
+	Command   **commands = sql_script[st->use_file].commands;
+
+	for (j = 0; commands[j] != NULL; j++)
+	{
+		PGresult   *res;
+		char		name[MAX_PREPARE_NAME];
+
+		if (commands[j]->type != SQL_COMMAND)
+			continue;
+		preparedStatementName(name, st->use_file, j);
+		res = PQprepare(st->con, name,
+						commands[j]->argv[0], commands[j]->argc - 1, NULL);
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			pg_log_error("%s", PQerrorMessage(st->con));
+		PQclear(res);
+	}
+	st->prepared[st->use_file] = true;
+}
+
 /* Send a SQL command, using the chosen querymode */
 static bool
 sendCommand(CState *st, Command *command)
@@ -3104,39 +3128,7 @@ sendCommand(CState *st, Command *command)
 		const char *params[MAX_ARGS];
 
 		if (!st->prepared[st->use_file])
-		{
-			int			j;
-			Command   **commands = sql_script[st->use_file].commands;
-
-			for (j = 0; commands[j] != NULL; j++)
-			{
-				PGresult   *res;
-
-				if (commands[j]->type != SQL_COMMAND)
-					continue;
-				preparedStatementName(name, st->use_file, j);
-				if (PQpipelineStatus(st->con) == PQ_PIPELINE_OFF)
-				{
-					res = PQprepare(st->con, name,
-									commands[j]->argv[0], commands[j]->argc - 1, NULL);
-					if (PQresultStatus(res) != PGRES_COMMAND_OK)
-						pg_log_error("%s", PQerrorMessage(st->con));
-					PQclear(res);
-				}
-				else
-				{
-					/*
-					 * In pipeline mode, we use asynchronous functions. If a
-					 * server-side error occurs, it will be processed later
-					 * among the other results.
-					 */
-					if (!PQsendPrepare(st->con, name,
-									   commands[j]->argv[0], commands[j]->argc - 1, NULL))
-						pg_log_error("%s", PQerrorMessage(st->con));
-				}
-			}
-			st->prepared[st->use_file] = true;
-		}
+			prepareCommands(st);
 
 		getQueryParams(&st->variables, command, params);
 		preparedStatementName(name, st->use_file, st->command);
@@ -4376,6 +4368,19 @@ executeMetaCommand(CState *st, pg_time_usec_t *now)
 			commandFailed(st, "startpipeline", "cannot use pipeline mode with the simple query protocol");
 			return CSTATE_ABORTED;
 		}
+		/*
+		 * Prepare SQL commands if needed.
+		 *
+		 * We should send Parse messages before executing /startpipeline.
+		 * If a Parse message is sent in pipeline mode, a transaction starts
+		 * before BEGIN is sent, and it could be a problem. For example, if
+		 * "BEGIN ISOLATION LEVEL SERIALIZABLE" is sent after a transaction
+		 * starts, the error
+		 * "ERROR:  SET TRANSACTION ISOLATION LEVEL must be called before any query"
+		 * occurs.
+		 */
+		else if (querymode == QUERY_PREPARED && !st->prepared[st->use_file])
+			prepareCommands(st);
 
 		if (PQpipelineStatus(st->con) != PQ_PIPELINE_OFF)
 		{
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 2c0dc36965..e70a7d966c 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -839,6 +839,23 @@ select 1 \gset f
 }
 	});
 
+# Working \startpipeline in prepared query mode with serializable
+$node->pgbench(
+	'-t 1 -n -M prepared',
+	0,
+	[ qr{type: .*/001_pgbench_pipeline_serializable}, qr{actually processed: 1/1} ],
+	[],
+	'working \startpipeline with serializable',
+	{
+		'001_pgbench_pipeline_serializable' => q{
+-- test startpipeline with serializable
+\startpipeline
+BEGIN ISOLATION LEVEL SERIALIZABLE;
+} . "select 1;\n" x 10 . q{
+END;
+\endpipeline
+}
+	});
 
 # trigger many expression errors
 my @errors = (
#16Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Yugo NAGATA (#15)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

I'm writing my own patch for this problem. While playing around with
it, I noticed this:

struct Command {
PQExpBufferData lines; /* 0 24 */
char * first_line; /* 24 8 */
int type; /* 32 4 */
MetaCommand meta; /* 36 4 */
int argc; /* 40 4 */

/* XXX 4 bytes hole, try to pack */

char * argv[256]; /* 48 2048 */
/* --- cacheline 32 boundary (2048 bytes) was 48 bytes ago --- */
char * varprefix; /* 2096 8 */
PgBenchExpr * expr; /* 2104 8 */
/* --- cacheline 33 boundary (2112 bytes) --- */
SimpleStats stats; /* 2112 40 */
int64 retries; /* 2152 8 */
int64 failures; /* 2160 8 */

/* size: 2168, cachelines: 34, members: 11 */
/* sum members: 2164, holes: 1, sum holes: 4 */
/* last cacheline: 56 bytes */
};

Not great. I suppose this makes pgbench slower than it needs to be.
Can we do better?

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Nunca se desea ardientemente lo que solo se desea por razón" (F. Alexandre)

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#16)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

I'm writing my own patch for this problem. While playing around with
it, I noticed this:

struct Command {
/* size: 2168, cachelines: 34, members: 11 */
/* sum members: 2164, holes: 1, sum holes: 4 */
/* last cacheline: 56 bytes */
};

I think the original intent was for argv[] to be at the end,
which fell victim to ye olde add-at-the-end antipattern.
Cache-friendliness-wise, putting it back to the end would
likely be enough. But turning it into a variable-size array
would be better from a functionality standpoint.

regards, tom lane

#18Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Yugo NAGATA (#15)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On 2022-Sep-30, Yugo NAGATA wrote:

Well, I still don't understand why we need to prepare only "the
commands within a pipeline" before starting pipeline. In the current
behavior, the entire script is prepared in advance just before executing
the first SQL command in the script, instead of preparing each command
one by one. This patch also prepare the entire script in advance, so
there is no behavioural change in this sense.

However, there are a few behavioural changes. One is that the preparation
is not counted in the command performance statistics as Fabien mentioned.
Another is that all meta-commands including \shell and \sleep etc. are
executed before the preparation.

To reduce impact of these changes, I updated the patch to prepare the
commands just before executing the first SQL command or \startpipeline
meta-command instead of at the beginning of the script.

I propose instead the following: each command is prepared just before
it's executed, as previously, and if we see a \startpipeline, then we
prepare all commands starting with the one just after, and until the
\endpipeline.

I didn't test additional cases other than the one you submitted.

Testing this I noticed that pg_log_debug et al don't support
multithreading very well -- the lines are interspersed.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

#19Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Alvaro Herrera (#18)
1 attachment(s)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On 2023-Feb-08, Alvaro Herrera wrote:

I propose instead the following: each command is prepared just before
it's executed, as previously, and if we see a \startpipeline, then we
prepare all commands starting with the one just after, and until the
\endpipeline.

Here's the patch.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/

Attachments:

v7-0001-Prepare-commands-in-pipelines-in-advance.patchtext/x-diff; charset=us-asciiDownload
From 6d8938009b246463efe4104f5312e37b28b55235 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Wed, 8 Feb 2023 13:01:47 +0100
Subject: [PATCH v7] Prepare commands in pipelines in advance

---
 src/bin/pgbench/pgbench.c                    | 147 +++++++++++++------
 src/bin/pgbench/t/001_pgbench_with_server.pl |  17 +++
 2 files changed, 117 insertions(+), 47 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 508ed218e8..a66fd51f92 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -628,7 +628,8 @@ typedef struct
 	pg_time_usec_t txn_begin;	/* used for measuring schedule lag times */
 	pg_time_usec_t stmt_begin;	/* used for measuring statement latencies */
 
-	bool		prepared[MAX_SCRIPTS];	/* whether client prepared the script */
+	/* whether client prepared each command of each script */
+	bool	  **prepared;
 
 	/*
 	 * For processing failures and repeating transactions with serialization
@@ -739,6 +740,7 @@ static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
  * aset			do gset on all possible queries of a combined query (\;).
  * expr			Parsed expression, if needed.
  * stats		Time spent in this command.
+ * prepname		The name that this command is prepared under, in prepare mode
  * retries		Number of retries after a serialization or deadlock error in the
  *				current command.
  * failures		Number of errors in the current command that were not retried.
@@ -754,6 +756,7 @@ typedef struct Command
 	char	   *varprefix;
 	PgBenchExpr *expr;
 	SimpleStats stats;
+	char	   *prepname;
 	int64		retries;
 	int64		failures;
 } Command;
@@ -3006,13 +3009,6 @@ runShellCommand(Variables *variables, char *variable, char **argv, int argc)
 	return true;
 }
 
-#define MAX_PREPARE_NAME		32
-static void
-preparedStatementName(char *buffer, int file, int state)
-{
-	sprintf(buffer, "P%d_%d", file, state);
-}
-
 /*
  * Report the abortion of the client when processing SQL commands.
  */
@@ -3053,6 +3049,87 @@ chooseScript(TState *thread)
 	return i - 1;
 }
 
+/*
+ * Prepare the SQL command from st->use_file at command_num.  The name of the
+ * new prepared statement is stored in command->prepname.
+ */
+static void
+prepareCommand(CState *st, int command_num)
+{
+	Command    *command = sql_script[st->use_file].commands[command_num];
+	static int	prepnr = 0;
+
+	/* No prepare for non-SQL commands */
+	if (command->type != SQL_COMMAND)
+		return;
+
+	if (!st->prepared)
+	{
+		st->prepared = pg_malloc(sizeof(bool *) * num_scripts);
+		for (int i = 0; i < num_scripts; i++)
+		{
+			ParsedScript *script = &sql_script[i];
+			int			numcmds;
+
+			for (numcmds = 0; script->commands[numcmds] != NULL; numcmds++)
+				;
+			st->prepared[i] = pg_malloc0(numcmds);
+		}
+	}
+
+	if (!st->prepared[st->use_file][command_num])
+	{
+		PGresult   *res;
+
+		if (!command->prepname)
+			command->prepname = psprintf("P%d_%d", st->use_file, prepnr++);
+		pg_log_debug("client %d preparing %s", st->id, command->prepname);
+		res = PQprepare(st->con, command->prepname,
+						command->argv[0], command->argc - 1, NULL);
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			pg_log_error("%s", PQerrorMessage(st->con));
+		PQclear(res);
+		st->prepared[st->use_file][command_num] = true;
+	}
+}
+
+/*
+ * Prepare all the commands in the script that come after the \startpipeline
+ * that's at position st->command, and the first \endpipeline we find.
+ *
+ * (This sets the ->prepared flag for each relevant command, but doesn't move
+ * the st->command counter)
+ */
+static void
+prepareCommandsInPipeline(CState *st)
+{
+	int			j;
+	Command   **commands = sql_script[st->use_file].commands;
+
+	Assert(commands[st->command]->type == META_COMMAND &&
+		   commands[st->command]->meta == META_STARTPIPELINE);
+
+	/*
+	 * We set the 'prepared' flag on the \startpipeline itself to flag that we
+	 * don't need to do this next time, even though we don't actually prepare
+	 * it.
+	 */
+	if (st->prepared &&
+		st->prepared[st->use_file][st->command])
+		return;
+
+	for (j = st->command + 1; commands[j] != NULL; j++)
+	{
+		if (commands[j]->type == META_COMMAND &&
+			commands[j]->meta == META_ENDPIPELINE)
+			break;
+
+		prepareCommand(st, j);
+	}
+
+	st->prepared[st->use_file][st->command] = true;
+}
+
 /* Send a SQL command, using the chosen querymode */
 static bool
 sendCommand(CState *st, Command *command)
@@ -3083,49 +3160,14 @@ sendCommand(CState *st, Command *command)
 	}
 	else if (querymode == QUERY_PREPARED)
 	{
-		char		name[MAX_PREPARE_NAME];
 		const char *params[MAX_ARGS];
 
-		if (!st->prepared[st->use_file])
-		{
-			int			j;
-			Command   **commands = sql_script[st->use_file].commands;
-
-			for (j = 0; commands[j] != NULL; j++)
-			{
-				PGresult   *res;
-
-				if (commands[j]->type != SQL_COMMAND)
-					continue;
-				preparedStatementName(name, st->use_file, j);
-				if (PQpipelineStatus(st->con) == PQ_PIPELINE_OFF)
-				{
-					res = PQprepare(st->con, name,
-									commands[j]->argv[0], commands[j]->argc - 1, NULL);
-					if (PQresultStatus(res) != PGRES_COMMAND_OK)
-						pg_log_error("%s", PQerrorMessage(st->con));
-					PQclear(res);
-				}
-				else
-				{
-					/*
-					 * In pipeline mode, we use asynchronous functions. If a
-					 * server-side error occurs, it will be processed later
-					 * among the other results.
-					 */
-					if (!PQsendPrepare(st->con, name,
-									   commands[j]->argv[0], commands[j]->argc - 1, NULL))
-						pg_log_error("%s", PQerrorMessage(st->con));
-				}
-			}
-			st->prepared[st->use_file] = true;
-		}
+		prepareCommand(st, st->command);
 
 		getQueryParams(&st->variables, command, params);
-		preparedStatementName(name, st->use_file, st->command);
 
-		pg_log_debug("client %d sending %s", st->id, name);
-		r = PQsendQueryPrepared(st->con, name, command->argc - 1,
+		pg_log_debug("client %d sending %s", st->id, command->prepname);
+		r = PQsendQueryPrepared(st->con, command->prepname, command->argc - 1,
 								params, NULL, NULL, 0);
 	}
 	else						/* unknown sql mode */
@@ -3597,7 +3639,8 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 					thread->conn_duration += now - start;
 
 					/* Reset session-local state */
-					memset(st->prepared, 0, sizeof(st->prepared));
+					pg_free(st->prepared);
+					st->prepared = NULL;
 				}
 
 				/*
@@ -4360,6 +4403,16 @@ executeMetaCommand(CState *st, pg_time_usec_t *now)
 			return CSTATE_ABORTED;
 		}
 
+		/*
+		 * If we're in prepared-query mode, we need to prepare all the
+		 * commands that are inside the pipeline.  This solves the problem
+		 * that running BEGIN ISOLATION LEVEL SERIALIZABLE would fail, because
+		 * by the time the command is executed, a snapshot would have already
+		 * been acquired, so an error would be thrown.
+		 */
+		if (querymode == QUERY_PREPARED)
+			prepareCommandsInPipeline(st);
+
 		if (PQpipelineStatus(st->con) != PQ_PIPELINE_OFF)
 		{
 			commandFailed(st, "startpipeline", "already in pipeline mode");
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 4bf508ea96..dfdeb1f70f 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -839,6 +839,23 @@ select 1 \gset f
 }
 	});
 
+# Working \startpipeline in prepared query mode with serializable
+$node->pgbench(
+	'-t 1 -n -M prepared',
+	0,
+	[ qr{type: .*/001_pgbench_pipeline_serializable}, qr{actually processed: 1/1} ],
+	[],
+	'working \startpipeline with serializable',
+	{
+		'001_pgbench_pipeline_serializable' => q{
+-- test startpipeline with serializable
+\startpipeline
+BEGIN ISOLATION LEVEL SERIALIZABLE;
+} . "select 1;\n" x 10 . q{
+END;
+\endpipeline
+}
+	});
 
 # trigger many expression errors
 my @errors = (
-- 
2.30.2

#20Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#19)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

Hi,

On 2023-02-08 13:10:40 +0100, Alvaro Herrera wrote:

On 2023-Feb-08, Alvaro Herrera wrote:

I propose instead the following: each command is prepared just before
it's executed, as previously, and if we see a \startpipeline, then we
prepare all commands starting with the one just after, and until the
\endpipeline.

Here's the patch.

There's something wrong with the patch, it reliably fails with core dumps:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3260

Example crash:
https://api.cirrus-ci.com/v1/task/4922406553255936/logs/cores.log
https://api.cirrus-ci.com/v1/artifact/task/6611256413519872/crashlog/crashlog-pgbench.EXE_0750_2023-02-13_14-07-06-189.txt

Andres

#21Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Andres Freund (#20)
1 attachment(s)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On 2023-Feb-13, Andres Freund wrote:

There's something wrong with the patch, it reliably fails with core dumps:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F42%2F3260

I think this would happen on machines where sizeof(bool) is not 1 (which
mine is evidently not). Fixed.

In addition, there was the problem that the psprintf() to generate the
command name would race against each other if you had multiple threads.
I changed the code so that the name to prepare each statement under is
generated when the Command struct is first initialized, which occurs
before the threads are started. One small issue is that now we use a
single counter for all commands of all scripts, rather than a
script-local counter. This doesn't seem at all important.

I did realize that Nagata-san was right that we've always prepared the
whole script in advance; that behavior was there already in commit
49639a7b2c52 that introduced -Mprepared. We've never done each command
just before executing it.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Y una voz del caos me habló y me dijo
"Sonríe y sé feliz, podría ser peor".
Y sonreí. Y fui feliz.
Y fue peor.

Attachments:

v8-0001-pgbench-Prepare-commands-in-pipelines-in-advance.patchtext/x-diff; charset=us-asciiDownload
From 0728193a5f02d0dd6a1f3ec5fef314aec646ba33 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Fri, 17 Feb 2023 21:01:15 +0100
Subject: [PATCH v8] pgbench: Prepare commands in pipelines in advance

Failing to do so results in an error when a pgbench script starts a
serializable transaction inside a pipeline.
---
 src/bin/pgbench/pgbench.c                    | 155 +++++++++++++------
 src/bin/pgbench/t/001_pgbench_with_server.pl |  20 +++
 2 files changed, 126 insertions(+), 49 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 508ed218e8..38e0830e7e 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -628,7 +628,8 @@ typedef struct
 	pg_time_usec_t txn_begin;	/* used for measuring schedule lag times */
 	pg_time_usec_t stmt_begin;	/* used for measuring statement latencies */
 
-	bool		prepared[MAX_SCRIPTS];	/* whether client prepared the script */
+	/* whether client prepared each command of each script */
+	bool	  **prepared;
 
 	/*
 	 * For processing failures and repeating transactions with serialization
@@ -733,12 +734,13 @@ static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
  * argv			Command arguments, the first of which is the command or SQL
  *				string itself.  For SQL commands, after post-processing
  *				argv[0] is the same as 'lines' with variables substituted.
- * varprefix 	SQL commands terminated with \gset or \aset have this set
+ * varprefix	SQL commands terminated with \gset or \aset have this set
  *				to a non NULL value.  If nonempty, it's used to prefix the
  *				variable name that receives the value.
  * aset			do gset on all possible queries of a combined query (\;).
  * expr			Parsed expression, if needed.
  * stats		Time spent in this command.
+ * prepname		The name that this command is prepared under, in prepare mode
  * retries		Number of retries after a serialization or deadlock error in the
  *				current command.
  * failures		Number of errors in the current command that were not retried.
@@ -754,6 +756,7 @@ typedef struct Command
 	char	   *varprefix;
 	PgBenchExpr *expr;
 	SimpleStats stats;
+	char	   *prepname;
 	int64		retries;
 	int64		failures;
 } Command;
@@ -3006,13 +3009,6 @@ runShellCommand(Variables *variables, char *variable, char **argv, int argc)
 	return true;
 }
 
-#define MAX_PREPARE_NAME		32
-static void
-preparedStatementName(char *buffer, int file, int state)
-{
-	sprintf(buffer, "P%d_%d", file, state);
-}
-
 /*
  * Report the abortion of the client when processing SQL commands.
  */
@@ -3053,6 +3049,87 @@ chooseScript(TState *thread)
 	return i - 1;
 }
 
+/*
+ * Prepare the SQL command from st->use_file at command_num.
+ */
+static void
+prepareCommand(CState *st, int command_num)
+{
+	Command    *command = sql_script[st->use_file].commands[command_num];
+
+	/* No prepare for non-SQL commands */
+	if (command->type != SQL_COMMAND)
+		return;
+
+	/*
+	 * If not already done, allocate space for 'prepared' flags: one boolean
+	 * for each command of each script.
+	 */
+	if (!st->prepared)
+	{
+		st->prepared = pg_malloc(sizeof(bool *) * num_scripts);
+		for (int i = 0; i < num_scripts; i++)
+		{
+			ParsedScript *script = &sql_script[i];
+			int			numcmds;
+
+			for (numcmds = 0; script->commands[numcmds] != NULL; numcmds++)
+				;
+			st->prepared[i] = pg_malloc0(sizeof(bool) * numcmds);
+		}
+	}
+
+	if (!st->prepared[st->use_file][command_num])
+	{
+		PGresult   *res;
+
+		pg_log_debug("client %d preparing %s", st->id, command->prepname);
+		res = PQprepare(st->con, command->prepname,
+						command->argv[0], command->argc - 1, NULL);
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			pg_log_error("%s", PQerrorMessage(st->con));
+		PQclear(res);
+		st->prepared[st->use_file][command_num] = true;
+	}
+}
+
+/*
+ * Prepare all the commands in the script that come after the \startpipeline
+ * that's at position st->command, and the first \endpipeline we find.
+ *
+ * (This sets the ->prepared flag for each relevant command, but doesn't move
+ * the st->command counter)
+ */
+static void
+prepareCommandsInPipeline(CState *st)
+{
+	int			j;
+	Command   **commands = sql_script[st->use_file].commands;
+
+	Assert(commands[st->command]->type == META_COMMAND &&
+		   commands[st->command]->meta == META_STARTPIPELINE);
+
+	/*
+	 * We set the 'prepared' flag on the \startpipeline itself to flag that we
+	 * don't need to do this next time without calling prepareCommand(), even
+	 * though we don't actually prepare this command.
+	 */
+	if (st->prepared &&
+		st->prepared[st->use_file][st->command])
+		return;
+
+	for (j = st->command + 1; commands[j] != NULL; j++)
+	{
+		if (commands[j]->type == META_COMMAND &&
+			commands[j]->meta == META_ENDPIPELINE)
+			break;
+
+		prepareCommand(st, j);
+	}
+
+	st->prepared[st->use_file][st->command] = true;
+}
+
 /* Send a SQL command, using the chosen querymode */
 static bool
 sendCommand(CState *st, Command *command)
@@ -3083,49 +3160,13 @@ sendCommand(CState *st, Command *command)
 	}
 	else if (querymode == QUERY_PREPARED)
 	{
-		char		name[MAX_PREPARE_NAME];
 		const char *params[MAX_ARGS];
 
-		if (!st->prepared[st->use_file])
-		{
-			int			j;
-			Command   **commands = sql_script[st->use_file].commands;
-
-			for (j = 0; commands[j] != NULL; j++)
-			{
-				PGresult   *res;
-
-				if (commands[j]->type != SQL_COMMAND)
-					continue;
-				preparedStatementName(name, st->use_file, j);
-				if (PQpipelineStatus(st->con) == PQ_PIPELINE_OFF)
-				{
-					res = PQprepare(st->con, name,
-									commands[j]->argv[0], commands[j]->argc - 1, NULL);
-					if (PQresultStatus(res) != PGRES_COMMAND_OK)
-						pg_log_error("%s", PQerrorMessage(st->con));
-					PQclear(res);
-				}
-				else
-				{
-					/*
-					 * In pipeline mode, we use asynchronous functions. If a
-					 * server-side error occurs, it will be processed later
-					 * among the other results.
-					 */
-					if (!PQsendPrepare(st->con, name,
-									   commands[j]->argv[0], commands[j]->argc - 1, NULL))
-						pg_log_error("%s", PQerrorMessage(st->con));
-				}
-			}
-			st->prepared[st->use_file] = true;
-		}
-
+		prepareCommand(st, st->command);
 		getQueryParams(&st->variables, command, params);
-		preparedStatementName(name, st->use_file, st->command);
 
-		pg_log_debug("client %d sending %s", st->id, name);
-		r = PQsendQueryPrepared(st->con, name, command->argc - 1,
+		pg_log_debug("client %d sending %s", st->id, command->prepname);
+		r = PQsendQueryPrepared(st->con, command->prepname, command->argc - 1,
 								params, NULL, NULL, 0);
 	}
 	else						/* unknown sql mode */
@@ -3597,7 +3638,8 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 					thread->conn_duration += now - start;
 
 					/* Reset session-local state */
-					memset(st->prepared, 0, sizeof(st->prepared));
+					pg_free(st->prepared);
+					st->prepared = NULL;
 				}
 
 				/*
@@ -4360,6 +4402,16 @@ executeMetaCommand(CState *st, pg_time_usec_t *now)
 			return CSTATE_ABORTED;
 		}
 
+		/*
+		 * If we're in prepared-query mode, we need to prepare all the
+		 * commands that are inside the pipeline before we actually start the
+		 * pipeline itself.  This solves the problem that running BEGIN
+		 * ISOLATION LEVEL SERIALIZABLE in a pipeline would fail due to a
+		 * snapshot having been acquired by the prepare within the pipeline.
+		 */
+		if (querymode == QUERY_PREPARED)
+			prepareCommandsInPipeline(st);
+
 		if (PQpipelineStatus(st->con) != PQ_PIPELINE_OFF)
 		{
 			commandFailed(st, "startpipeline", "already in pipeline mode");
@@ -5421,6 +5473,7 @@ create_sql_command(PQExpBuffer buf, const char *source)
 {
 	Command    *my_command;
 	char	   *p = skip_sql_comments(buf->data);
+	static int	prepnr = 0;
 
 	if (p == NULL)
 		return NULL;
@@ -5439,6 +5492,10 @@ create_sql_command(PQExpBuffer buf, const char *source)
 	my_command->varprefix = NULL;	/* allocated later, if needed */
 	my_command->expr = NULL;
 	initSimpleStats(&my_command->stats);
+	if (querymode == QUERY_PREPARED)
+		my_command->prepname = psprintf("P_%d", prepnr++);
+	else
+		my_command->prepname = NULL;
 
 	return my_command;
 }
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 4bf508ea96..99273203f0 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -839,6 +839,26 @@ select 1 \gset f
 }
 	});
 
+# Working \startpipeline in prepared query mode with serializable
+$node->pgbench(
+	'-c4 -j2 -t 10 -n -M prepared',
+	0,
+	[
+		qr{type: .*/001_pgbench_pipeline_serializable},
+		qr{actually processed: (\d+)/\1}
+	],
+	[],
+	'working \startpipeline with serializable',
+	{
+		'001_pgbench_pipeline_serializable' => q{
+-- test startpipeline with serializable
+\startpipeline
+BEGIN ISOLATION LEVEL SERIALIZABLE;
+} . "select 1;\n" x 10 . q{
+END;
+\endpipeline
+}
+	});
 
 # trigger many expression errors
 my @errors = (
-- 
2.30.2

#22Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Alvaro Herrera (#21)
1 attachment(s)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

Found one last problem: if you do "-f foobar.sql -M prepared" in that
order, then the prepare fails because the statement names would not be
assigned when the file is parsed. This coding only supported doing
"-M prepared -f foobar.sql", which funnily enough is the only one that
PostgreSQL/Cluster.pm->pgbench() supports. So I moved the prepared
statement name generation to the postprocess step.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"La rebeldía es la virtud original del hombre" (Arthur Schopenhauer)

Attachments:

v9-0001-pgbench-Prepare-commands-in-pipelines-in-advance.patchtext/x-diff; charset=us-asciiDownload
From 14ed26c3f1a7033daaa93d6cd2d33d1100dd033d Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Mon, 20 Feb 2023 20:25:26 +0100
Subject: [PATCH v9] pgbench: Prepare commands in pipelines in advance

Failing to do so results in an error when a pgbench script tries to
start a serializable transaction inside a pipeline, because by the time
BEGIN ISOLATION LEVEL SERIALIZABLE is executed, we're already in a
transaction that has acquired a snapshot, so the server rightfully
complains.

We can work around that by preparing all commands in the pipeline before
actually starting the pipeline.  This changes the existing code in two
aspects: first, we now prepare each command individually at the point
where that command is about to be executed; previously, we would prepare
all commands in a script as soon as the first command of that script
would be executed.  It's hard to see that this would make much of a
difference (particularly since it only affects the first time to execute
each script in a client), but I didn't actually try to measure it.

Secondly, we no longer use PQsendPrepare() in pipeline mode, but only
PQprepare.  There's no specific reason for this change other than no
longer needing to do differently in pipeline mode.  (Previously we had
no choice, because in pipeline mode PQprepare could not be used.)

Backpatch to 14, where pgbench got support for pipeline mode.

Reported-by: Yugo NAGATA <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/20210716153013.fc53b1c780b06fccc07a7f0d@sraoss.co.jp
---
 src/bin/pgbench/pgbench.c                    | 153 +++++++++++++------
 src/bin/pgbench/t/001_pgbench_with_server.pl |  20 +++
 2 files changed, 124 insertions(+), 49 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 508ed218e8..a3a2040a3c 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -628,7 +628,8 @@ typedef struct
 	pg_time_usec_t txn_begin;	/* used for measuring schedule lag times */
 	pg_time_usec_t stmt_begin;	/* used for measuring statement latencies */
 
-	bool		prepared[MAX_SCRIPTS];	/* whether client prepared the script */
+	/* whether client prepared each command of each script */
+	bool	  **prepared;
 
 	/*
 	 * For processing failures and repeating transactions with serialization
@@ -733,7 +734,8 @@ static const char *QUERYMODE[] = {"simple", "extended", "prepared"};
  * argv			Command arguments, the first of which is the command or SQL
  *				string itself.  For SQL commands, after post-processing
  *				argv[0] is the same as 'lines' with variables substituted.
- * varprefix 	SQL commands terminated with \gset or \aset have this set
+ * prepname		The name that this command is prepared under, in prepare mode
+ * varprefix	SQL commands terminated with \gset or \aset have this set
  *				to a non NULL value.  If nonempty, it's used to prefix the
  *				variable name that receives the value.
  * aset			do gset on all possible queries of a combined query (\;).
@@ -751,6 +753,7 @@ typedef struct Command
 	MetaCommand meta;
 	int			argc;
 	char	   *argv[MAX_ARGS];
+	char	   *prepname;
 	char	   *varprefix;
 	PgBenchExpr *expr;
 	SimpleStats stats;
@@ -3006,13 +3009,6 @@ runShellCommand(Variables *variables, char *variable, char **argv, int argc)
 	return true;
 }
 
-#define MAX_PREPARE_NAME		32
-static void
-preparedStatementName(char *buffer, int file, int state)
-{
-	sprintf(buffer, "P%d_%d", file, state);
-}
-
 /*
  * Report the abortion of the client when processing SQL commands.
  */
@@ -3053,6 +3049,87 @@ chooseScript(TState *thread)
 	return i - 1;
 }
 
+/*
+ * Prepare the SQL command from st->use_file at command_num.
+ */
+static void
+prepareCommand(CState *st, int command_num)
+{
+	Command    *command = sql_script[st->use_file].commands[command_num];
+
+	/* No prepare for non-SQL commands */
+	if (command->type != SQL_COMMAND)
+		return;
+
+	/*
+	 * If not already done, allocate space for 'prepared' flags: one boolean
+	 * for each command of each script.
+	 */
+	if (!st->prepared)
+	{
+		st->prepared = pg_malloc(sizeof(bool *) * num_scripts);
+		for (int i = 0; i < num_scripts; i++)
+		{
+			ParsedScript *script = &sql_script[i];
+			int			numcmds;
+
+			for (numcmds = 0; script->commands[numcmds] != NULL; numcmds++)
+				;
+			st->prepared[i] = pg_malloc0(sizeof(bool) * numcmds);
+		}
+	}
+
+	if (!st->prepared[st->use_file][command_num])
+	{
+		PGresult   *res;
+
+		pg_log_debug("client %d preparing %s", st->id, command->prepname);
+		res = PQprepare(st->con, command->prepname,
+						command->argv[0], command->argc - 1, NULL);
+		if (PQresultStatus(res) != PGRES_COMMAND_OK)
+			pg_log_error("%s", PQerrorMessage(st->con));
+		PQclear(res);
+		st->prepared[st->use_file][command_num] = true;
+	}
+}
+
+/*
+ * Prepare all the commands in the script that come after the \startpipeline
+ * that's at position st->command, and the first \endpipeline we find.
+ *
+ * This sets the ->prepared flag for each relevant command as well as the
+ * \startpipeline itself, but doesn't move the st->command counter.
+ */
+static void
+prepareCommandsInPipeline(CState *st)
+{
+	int			j;
+	Command   **commands = sql_script[st->use_file].commands;
+
+	Assert(commands[st->command]->type == META_COMMAND &&
+		   commands[st->command]->meta == META_STARTPIPELINE);
+
+	/*
+	 * We set the 'prepared' flag on the \startpipeline itself to flag that we
+	 * don't need to do this next time without calling prepareCommand(), even
+	 * though we don't actually prepare this command.
+	 */
+	if (st->prepared &&
+		st->prepared[st->use_file][st->command])
+		return;
+
+	for (j = st->command + 1; commands[j] != NULL; j++)
+	{
+		if (commands[j]->type == META_COMMAND &&
+			commands[j]->meta == META_ENDPIPELINE)
+			break;
+
+		prepareCommand(st, j);
+	}
+
+	st->prepared[st->use_file][st->command] = true;
+}
+
 /* Send a SQL command, using the chosen querymode */
 static bool
 sendCommand(CState *st, Command *command)
@@ -3083,49 +3160,13 @@ sendCommand(CState *st, Command *command)
 	}
 	else if (querymode == QUERY_PREPARED)
 	{
-		char		name[MAX_PREPARE_NAME];
 		const char *params[MAX_ARGS];
 
-		if (!st->prepared[st->use_file])
-		{
-			int			j;
-			Command   **commands = sql_script[st->use_file].commands;
-
-			for (j = 0; commands[j] != NULL; j++)
-			{
-				PGresult   *res;
-
-				if (commands[j]->type != SQL_COMMAND)
-					continue;
-				preparedStatementName(name, st->use_file, j);
-				if (PQpipelineStatus(st->con) == PQ_PIPELINE_OFF)
-				{
-					res = PQprepare(st->con, name,
-									commands[j]->argv[0], commands[j]->argc - 1, NULL);
-					if (PQresultStatus(res) != PGRES_COMMAND_OK)
-						pg_log_error("%s", PQerrorMessage(st->con));
-					PQclear(res);
-				}
-				else
-				{
-					/*
-					 * In pipeline mode, we use asynchronous functions. If a
-					 * server-side error occurs, it will be processed later
-					 * among the other results.
-					 */
-					if (!PQsendPrepare(st->con, name,
-									   commands[j]->argv[0], commands[j]->argc - 1, NULL))
-						pg_log_error("%s", PQerrorMessage(st->con));
-				}
-			}
-			st->prepared[st->use_file] = true;
-		}
-
+		prepareCommand(st, st->command);
 		getQueryParams(&st->variables, command, params);
-		preparedStatementName(name, st->use_file, st->command);
 
-		pg_log_debug("client %d sending %s", st->id, name);
-		r = PQsendQueryPrepared(st->con, name, command->argc - 1,
+		pg_log_debug("client %d sending %s", st->id, command->prepname);
+		r = PQsendQueryPrepared(st->con, command->prepname, command->argc - 1,
 								params, NULL, NULL, 0);
 	}
 	else						/* unknown sql mode */
@@ -3597,7 +3638,8 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 					thread->conn_duration += now - start;
 
 					/* Reset session-local state */
-					memset(st->prepared, 0, sizeof(st->prepared));
+					pg_free(st->prepared);
+					st->prepared = NULL;
 				}
 
 				/*
@@ -4360,6 +4402,16 @@ executeMetaCommand(CState *st, pg_time_usec_t *now)
 			return CSTATE_ABORTED;
 		}
 
+		/*
+		 * If we're in prepared-query mode, we need to prepare all the
+		 * commands that are inside the pipeline before we actually start the
+		 * pipeline itself.  This solves the problem that running BEGIN
+		 * ISOLATION LEVEL SERIALIZABLE in a pipeline would fail due to a
+		 * snapshot having been acquired by the prepare within the pipeline.
+		 */
+		if (querymode == QUERY_PREPARED)
+			prepareCommandsInPipeline(st);
+
 		if (PQpipelineStatus(st->con) != PQ_PIPELINE_OFF)
 		{
 			commandFailed(st, "startpipeline", "already in pipeline mode");
@@ -5439,6 +5491,7 @@ create_sql_command(PQExpBuffer buf, const char *source)
 	my_command->varprefix = NULL;	/* allocated later, if needed */
 	my_command->expr = NULL;
 	initSimpleStats(&my_command->stats);
+	my_command->prepname = NULL;	/* set later, if needed */
 
 	return my_command;
 }
@@ -5468,6 +5521,7 @@ static void
 postprocess_sql_command(Command *my_command)
 {
 	char		buffer[128];
+	static int	prepnum = 0;
 
 	Assert(my_command->type == SQL_COMMAND);
 
@@ -5487,6 +5541,7 @@ postprocess_sql_command(Command *my_command)
 		case QUERY_PREPARED:
 			if (!parseQuery(my_command))
 				exit(1);
+			my_command->prepname = psprintf("P_%d", prepnum++);
 			break;
 		default:
 			exit(1);
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 4bf508ea96..99273203f0 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -839,6 +839,26 @@ select 1 \gset f
 }
 	});
 
+# Working \startpipeline in prepared query mode with serializable
+$node->pgbench(
+	'-c4 -j2 -t 10 -n -M prepared',
+	0,
+	[
+		qr{type: .*/001_pgbench_pipeline_serializable},
+		qr{actually processed: (\d+)/\1}
+	],
+	[],
+	'working \startpipeline with serializable',
+	{
+		'001_pgbench_pipeline_serializable' => q{
+-- test startpipeline with serializable
+\startpipeline
+BEGIN ISOLATION LEVEL SERIALIZABLE;
+} . "select 1;\n" x 10 . q{
+END;
+\endpipeline
+}
+	});
 
 # trigger many expression errors
 my @errors = (
-- 
2.30.2

#23Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Alvaro Herrera (#22)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On 2023-Feb-20, Alvaro Herrera wrote:

Found one last problem: if you do "-f foobar.sql -M prepared" in that
order, then the prepare fails because the statement names would not be
assigned when the file is parsed. This coding only supported doing
"-M prepared -f foobar.sql", which funnily enough is the only one that
PostgreSQL/Cluster.pm->pgbench() supports. So I moved the prepared
statement name generation to the postprocess step.

Pushed to all three branches -- thanks, Nagata-san, for diagnosing the
issue.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"How strange it is to find the words "Perl" and "saner" in such close
proximity, with no apparent sense of irony. I doubt that Larry himself
could have managed it." (ncm, http://lwn.net/Articles/174769/)

#24Alexander Lakhin
exclusion@gmail.com
In reply to: Alvaro Herrera (#23)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

Hello Alvaro,

21.02.2023 19:32, Alvaro Herrera wrote:

On 2023-Feb-20, Alvaro Herrera wrote:

Found one last problem: if you do "-f foobar.sql -M prepared" in that
order, then the prepare fails because the statement names would not be
assigned when the file is parsed. This coding only supported doing
"-M prepared -f foobar.sql", which funnily enough is the only one that
PostgreSQL/Cluster.pm->pgbench() supports. So I moved the prepared
statement name generation to the postprocess step.

Pushed to all three branches -- thanks, Nagata-san, for diagnosing the
issue.

Starting from 038f586d5, the following script:
echo "
\startpipeline
\endpipeline
" >test.sql
pgbench -n -M prepared -f test.sql

leads to the pgbench's segfault:
Core was generated by `pgbench -n -M prepared -f test.sql'.
Program terminated with signal SIGSEGV, Segmentation fault.

warning: Section `.reg-xstate/2327306' in core file too small.
#0  0x0000555a402546b4 in prepareCommandsInPipeline (st=st@entry=0x555a409d62e0) at pgbench.c:3130
3130            st->prepared[st->use_file][st->command] = true;
(gdb) bt
#0  0x0000555a402546b4 in prepareCommandsInPipeline (st=st@entry=0x555a409d62e0) at pgbench.c:3130
#1  0x0000555a40257fca in executeMetaCommand (st=st@entry=0x555a409d62e0, now=now@entry=0x7ffdd46eff58)
    at pgbench.c:4413
#2  0x0000555a402585ce in advanceConnectionState (thread=thread@entry=0x555a409d6580, st=st@entry=0x555a409d62e0,
    agg=agg@entry=0x7ffdd46f0090) at pgbench.c:3807
#3  0x0000555a40259564 in threadRun (arg=arg@entry=0x555a409d6580) at pgbench.c:7535
#4  0x0000555a4025ca40 in main (argc=<optimized out>, argv=<optimized out>) at pgbench.c:7253

Best regards,
Alexander

#25Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Alexander Lakhin (#24)
1 attachment(s)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On 2023-May-20, Alexander Lakhin wrote:

Starting from 038f586d5, the following script:
echo "
\startpipeline
\endpipeline
" >test.sql
pgbench -n -M prepared -f test.sql

leads to the pgbench's segfault:

Hah, yeah, that's because an empty pipeline never calls the code to
allocate the flag array. Here's the trivial fix.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)

Attachments:

0001-Fix-pgbench-in-prepared-mode-with-empty-pipeline.patchtext/x-diff; charset=us-asciiDownload
From 50685847e057eb8e7509293fdd81247eb49854ef Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Mon, 22 May 2023 13:45:47 +0200
Subject: [PATCH] Fix pgbench in prepared mode with empty pipeline

---
 src/bin/pgbench/pgbench.c                    | 44 ++++++++++++--------
 src/bin/pgbench/t/001_pgbench_with_server.pl |  2 +
 2 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 7dbb2ed6a77..6caa84282c7 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3049,6 +3049,27 @@ chooseScript(TState *thread)
 	return i - 1;
 }
 
+/*
+ * Allocate space for 'prepared' flags of a script: we need one boolean
+ * for each command of each script.
+ */
+static void
+allocatePreparedFlags(CState *st)
+{
+	Assert(st->prepared == NULL);
+
+	st->prepared = pg_malloc(sizeof(bool *) * num_scripts);
+	for (int i = 0; i < num_scripts; i++)
+	{
+		ParsedScript *script = &sql_script[i];
+		int			numcmds;
+
+		for (numcmds = 0; script->commands[numcmds] != NULL; numcmds++)
+			;
+		st->prepared[i] = pg_malloc0(sizeof(bool) * numcmds);
+	}
+}
+
 /*
  * Prepare the SQL command from st->use_file at command_num.
  */
@@ -3061,23 +3082,8 @@ prepareCommand(CState *st, int command_num)
 	if (command->type != SQL_COMMAND)
 		return;
 
-	/*
-	 * If not already done, allocate space for 'prepared' flags: one boolean
-	 * for each command of each script.
-	 */
 	if (!st->prepared)
-	{
-		st->prepared = pg_malloc(sizeof(bool *) * num_scripts);
-		for (int i = 0; i < num_scripts; i++)
-		{
-			ParsedScript *script = &sql_script[i];
-			int			numcmds;
-
-			for (numcmds = 0; script->commands[numcmds] != NULL; numcmds++)
-				;
-			st->prepared[i] = pg_malloc0(sizeof(bool) * numcmds);
-		}
-	}
+		allocatePreparedFlags(st);
 
 	if (!st->prepared[st->use_file][command_num])
 	{
@@ -3109,13 +3115,15 @@ prepareCommandsInPipeline(CState *st)
 	Assert(commands[st->command]->type == META_COMMAND &&
 		   commands[st->command]->meta == META_STARTPIPELINE);
 
+	if (!st->prepared)
+		allocatePreparedFlags(st);
+
 	/*
 	 * We set the 'prepared' flag on the \startpipeline itself to flag that we
 	 * don't need to do this next time without calling prepareCommand(), even
 	 * though we don't actually prepare this command.
 	 */
-	if (st->prepared &&
-		st->prepared[st->use_file][st->command])
+	if (st->prepared[st->use_file][st->command])
 		return;
 
 	for (j = st->command + 1; commands[j] != NULL; j++)
diff --git a/src/bin/pgbench/t/001_pgbench_with_server.pl b/src/bin/pgbench/t/001_pgbench_with_server.pl
index 363a1ffabd5..f8ca8a922d1 100644
--- a/src/bin/pgbench/t/001_pgbench_with_server.pl
+++ b/src/bin/pgbench/t/001_pgbench_with_server.pl
@@ -790,6 +790,8 @@ $node->pgbench(
 		'001_pgbench_pipeline_prep' => q{
 -- test startpipeline
 \startpipeline
+\endpipeline
+\startpipeline
 } . "select 1;\n" x 10 . q{
 \endpipeline
 }
-- 
2.39.2

#26Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Alvaro Herrera (#25)
Re: pgbench: using prepared BEGIN statement in a pipeline could cause an error

On 2023-May-22, Alvaro Herrera wrote:

Hah, yeah, that's because an empty pipeline never calls the code to
allocate the flag array. Here's the trivial fix.

Pushed to both branches, thanks for the report.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/