Consider pipeline implicit transaction as a transaction block
Hi,
When using pipelining with implicit transaction, a transaction will
start from the first command and be committed with the Sync message.
Functions like IsInTransactionBlock and PreventInTransactionBlock
already assimilate this implicit transaction as a transaction block,
relying on the XACT_FLAGS_PIPELINING flag. However, this is not the
case in CheckTransactionBlock. This function is used for things like
warning when a local GUC is set outside of a transaction block.
The attached patch adds the detection of implicit transactions started
by a pipeline in CheckTransactionBlock, avoiding warnings when
commands like `set local` are called within a pipeline, and making the
detection of transaction block coherent with what's done in
IsInTransactionBlock and PreventInTransactionBlock.
One thing that's still not fixed by the patch is that the first
command won't be seen as part of a transaction block. For example,
with pgbench:
\startpipeline
SET LOCAL statement_timeout='200ms';
SELECT * FROM pgbench_accounts where aid=1;
\endpipeline
The XACT_FLAGS_PIPELINING will only be set after the first command, so
the warning about `set local` happening outside of a transaction block
will still be generated. However, I'm not sure if it's something
fixable (or worth fixing?). This would require to know beforehand that
there are multiple executes before the sync message, which doesn't
seem doable.
Regards,
Anthonin
Attachments:
v01-0001-Consider-pipeline-implicit-transaction-as-a-tran.patchapplication/octet-stream; name=v01-0001-Consider-pipeline-implicit-transaction-as-a-tran.patchDownload+6-1
On Wed, 30 Oct 2024 at 10:15, Anthonin Bonnefoy
<anthonin.bonnefoy@datadoghq.com> wrote:
The attached patch adds the detection of implicit transactions started
by a pipeline in CheckTransactionBlock, avoiding warnings when
commands like `set local` are called within a pipeline, and making the
detection of transaction block coherent with what's done in
IsInTransactionBlock and PreventInTransactionBlock.
+1 seems like a reasonable change.
The XACT_FLAGS_PIPELINING will only be set after the first command, so
the warning about `set local` happening outside of a transaction block
will still be generated. However, I'm not sure if it's something
fixable (or worth fixing?). This would require to know beforehand that
there are multiple executes before the sync message, which doesn't
seem doable.
Yeah, I don't really see a way around that apart from not throwing
this warning at all when the client is using the extended protocol.
Postgres would need to be clairvoyant to know to really know if it
should show it for the first message.
On Wed, Oct 30, 2024 at 04:06:20PM +0100, Jelte Fennema-Nio wrote:
On Wed, 30 Oct 2024 at 10:15, Anthonin Bonnefoy
<anthonin.bonnefoy@datadoghq.com> wrote:The attached patch adds the detection of implicit transactions started
by a pipeline in CheckTransactionBlock, avoiding warnings when
commands like `set local` are called within a pipeline, and making the
detection of transaction block coherent with what's done in
IsInTransactionBlock and PreventInTransactionBlock.+1 seems like a reasonable change.
That's indeed a bit strange. I think that you're right.
@Tom added in CC: Is there a specific reason why CheckTransactionBlock()
did not include a check based on XACT_FLAGS_PIPELINING when it got
introduced in 20432f873140, while IsInTransactionBlock() considers it?
This was discussed here:
/messages/by-id/17434-d9f7a064ce2a88a3@postgresql.org
--
Michael
On Thu, Oct 31, 2024 at 03:32:39PM +0900, Michael Paquier wrote:
@Tom added in CC: Is there a specific reason why CheckTransactionBlock()
did not include a check based on XACT_FLAGS_PIPELINING when it got
introduced in 20432f873140, while IsInTransactionBlock() considers it?
This also makes LOCK able to work flawlessly within pipelines as far
as I can see, as an error would be reported on HEAD. That's nice.
The behavior of the first command is interesting, still this saves
from the checks after any follow-up commands, and this is intended
from what I can see in postgres.c, because the xact flag is only set
once the first command completes.
Now, here is a fancy case: SAVEPOINT and its two brothers. An error
would be reported on HEAD if attempting a SAVEPOINT, complaining that
we are not in a transaction block. The patch causes a different, more
confusing, failure:
FATAL: DefineSavepoint: unexpected state STARTED
This is a bit user-unfriendly. I am not sure to see the point of
supporting savepoints in this context, so perhaps we should just issue
a cleaner error when we are under a XACT_FLAGS_PIPELINING? Reporting
that we are not in a transaction block, while, well, we are in an
implicit transaction block because of the use of pipelines is
confusing. The new error is actually worse.
--
Michael
On Sat, Nov 2, 2024 at 4:11 AM Michael Paquier <michael@paquier.xyz> wrote:
Now, here is a fancy case: SAVEPOINT and its two brothers. An error
would be reported on HEAD if attempting a SAVEPOINT, complaining that
we are not in a transaction block. The patch causes a different, more
confusing, failure:
FATAL: DefineSavepoint: unexpected state STARTEDThis is a bit user-unfriendly. I am not sure to see the point of
supporting savepoints in this context, so perhaps we should just issue
a cleaner error when we are under a XACT_FLAGS_PIPELINING? Reporting
that we are not in a transaction block, while, well, we are in an
implicit transaction block because of the use of pipelines is
confusing. The new error is actually worse.
There's a possible alternative approach. Instead of checking the
XACT_FLAGS_PIPELINING flag in
CheckTransactionBlock/IsInTransactionBlock/PreventInTransactionBlock,
it is possible to switch the transaction state to
TBLOCK_IMPLICIT_INPROGRESS by reusing BeginImplicitTransactionBlock.
This transaction state is used to represent a transaction block
created by a multi statement query which is executed within the same
transaction, which is very similar to what's done with pipelining.
This allows the removal of the XACT_FLAGS_PIPELINING check in
IsInTransactionBlock and PreventInTransactionBlock since the
transaction state will correctly reflect the ongoing implicit block.
Additionally, it will reuse the same behaviour with regard to
SAVEPOINT and disallow them with a "SAVEPOINT can only be used in
transaction blocks" error.
Attachments:
v02-0001-Consider-pipeline-implicit-transaction-as-a-tran.patchapplication/octet-stream; name=v02-0001-Consider-pipeline-implicit-transaction-as-a-tran.patchDownload+18-14
On Mon, Nov 04, 2024 at 10:42:36AM +0100, Anthonin Bonnefoy wrote:
This allows the removal of the XACT_FLAGS_PIPELINING check in
IsInTransactionBlock and PreventInTransactionBlock since the
transaction state will correctly reflect the ongoing implicit block.
Additionally, it will reuse the same behaviour with regard to
SAVEPOINT and disallow them with a "SAVEPOINT can only be used in
transaction blocks" error.
Ah, interesting. I did not notice this bit in DefineSavepoint(). So
that would be the path that we would bump on for an error as
TBLOCK_IMPLICIT_INPROGRESS would not be set, based on the fact that
it does not really make sense to come back to a previous state while
we are in a pipeline.
It would be nice to document all these behaviors with regression
tests in pgbench as it is the only place where we can control that
with error pattern checks. Let's say:
- LOCK in first position of a pipeline fails, works on follow-up
commands.
- Some command not allowed in transaction blocks, like a REINDEX
CONCURRENTLY or similar?
- One part with the SET LOCAL portion of the problem, where we don't
have a hard error on the first command in this case.
--
Michael
On Tue, Nov 5, 2024 at 6:50 AM Michael Paquier <michael@paquier.xyz> wrote:
It would be nice to document all these behaviors with regression
tests in pgbench as it is the only place where we can control that
with error pattern checks.
It's not the first time I wanted to be able to do pipelining in psql
as relying on pgbench and tap tests is not the most flexible so I took
a stab at adding it.
0001: This is a small bug I've stumbled upon. The query buffer is not
cleared on a backslash error. For example, "SELECT 1 \parse" would
fail due to a missing statement name but would leave "SELECT 1\n" in
the query buffer which would impact the next command.
0002: This adds pipeline support to psql by adding the same
meta-commands as pgbench: \startpipeline, \endpipeline and
\syncpipeline. Once a pipeline is started, all extended protocol
queries are piped until \endpipeline is called. To keep things simple,
meta-commands like \gx, \gdesc and \gexec are forbidden inside a
pipeline. The tests are covering the existing pipelining behaviour
with the SET LOCAL, SAVEPOINTS, REINDEX CONCURRENTLY and LOCK
commands, depending if it's the first command or not. The
\syncpipeline allows us to see that this "first command" behaviour
also happens after a synchronisation point within a pipeline.
0003: The initial change to use implicit transaction block when
pipelining is detected. The tests reflect the change in behaviour like
the LOCK command working if it's the second command in the pipeline.
I've updated the pipeline documentation to provide more details about
the use of implicit transaction blocks with pipelines.
Attachments:
v03-0001-Reset-query-buffer-on-a-backslash-error.patchapplication/octet-stream; name=v03-0001-Reset-query-buffer-on-a-backslash-error.patchDownload+19-5
v03-0003-Consider-pipeline-implicit-transaction-as-a-tran.patchapplication/octet-stream; name=v03-0003-Consider-pipeline-implicit-transaction-as-a-tran.patchDownload+95-17
v03-0002-Add-pipeline-support-in-psql.patchapplication/octet-stream; name=v03-0002-Add-pipeline-support-in-psql.patchDownload+617-5
Some minor changes: I forgot to add the new pipeline meta-commands to
psql's help output, this is now added in 0002. I've also reworded the
doc a bit.
Attachments:
v04-0001-Reset-query-buffer-on-a-backslash-error.patchapplication/octet-stream; name=v04-0001-Reset-query-buffer-on-a-backslash-error.patchDownload+19-5
v04-0002-Add-pipeline-support-in-psql.patchapplication/octet-stream; name=v04-0002-Add-pipeline-support-in-psql.patchDownload+622-5
v04-0003-Consider-pipeline-implicit-transaction-as-a-tran.patchapplication/octet-stream; name=v04-0003-Consider-pipeline-implicit-transaction-as-a-tran.patchDownload+105-21
On Wed, Nov 20, 2024 at 06:03:12PM +0100, Anthonin Bonnefoy wrote:
0001: This is a small bug I've stumbled upon. The query buffer is not
cleared on a backslash error. For example, "SELECT 1 \parse" would
fail due to a missing statement name but would leave "SELECT 1\n" in
the query buffer which would impact the next command.
This breaks an existing property of psql. One example: \edit where we
should keep the existing query buffer rather than discard it if a
failure happens. This patch forcibly removes the contents of the
query buffer, and a failure could happen because of an incorrect
option given to this meta-command.
0002: This adds pipeline support to psql by adding the same
meta-commands as pgbench: \startpipeline, \endpipeline and
\syncpipeline. Once a pipeline is started, all extended protocol
queries are piped until \endpipeline is called. To keep things simple,
meta-commands like \gx, \gdesc and \gexec are forbidden inside a
pipeline. The tests are covering the existing pipelining behaviour
with the SET LOCAL, SAVEPOINTS, REINDEX CONCURRENTLY and LOCK
commands, depending if it's the first command or not. The
\syncpipeline allows us to see that this "first command" behaviour
also happens after a synchronisation point within a pipeline.
That's pretty cool. I've wanted that myself.
This relies on 0001 for the case where \gx fails and you want to make
sure that the follow-up command is able to work in the same pipeline.
I'd suggest to add a \reset after the \gx in this test sequence,
without 0001. At this point we have IMO more advantages in
maintaining the existing properties rather than enforce it, especially
as we have \reset.
+ /*
+ * In pipeline mode, a NULL result is returned to notify the
+ * next query is being processed. We need to consume it and
+ * get the next result.
+ */
+ result = PQgetResult(pset.db);
+ Assert(result == NULL);
+ result = PQgetResult(pset.db);
[...]
next_result = PQgetResult(pset.db);
+ if (process_pipeline && result_status != PGRES_PIPELINE_SYNC)
+ {
+ /*
+ * In pipeline mode, a NULL result indicates the end of the
+ * current query being processed. We need to call PQgetResult a
+ * second time to move to the next response.
+ */
+ Assert(next_result == NULL);
+ next_result = PQgetResult(pset.db);
+ }
I see. This business in ExecQueryAndProcessResults() is consistent
with what pgbench does in discardUntilSync() because of the outer
loop. I found a bit confusing that you had to do this step twice.
Perhaps this could use better comments, but I am not sure what to use
here..
0003: The initial change to use implicit transaction block when
pipelining is detected. The tests reflect the change in behaviour like
the LOCK command working if it's the second command in the pipeline.
I've updated the pipeline documentation to provide more details about
the use of implicit transaction blocks with pipelines.
+-- \startpipeline, \endpipeline, \syncpipeline, \bind and \g are psql specific meta-commands
+\startpipeline
Why is stuff specific to psql being mentioned on the libpq page? That
does not seem adapted to me here. Documenting the expectations and
the assumptions one would rely on in the context of a pipeline and
what the backend thinks of it as an internal transaction is a good
idea.
It is sad to not have at least 2~3 tests that we could use for the
back-branches for the ERROR cases and not the first command cases with
minimal sequences in pgbench? I'm OK with your proposal with psql on
HEAD, but backpatching without a few tests is going to make the
maintenance of stable branches more complicated in the long run as we
would navigate blindly. So I'd try to fix the problems first, then
work on any new proposal once we are in a clean operational state.
And I'd recommend to propose the new feature on a new, separate,
thread to attract a better audience for the reviews this would
require.
--
Michael
Thanks for the review!
On Mon, Nov 25, 2024 at 7:39 AM Michael Paquier <michael@paquier.xyz> wrote:
This breaks an existing property of psql. One example: \edit where we
should keep the existing query buffer rather than discard it if a
failure happens. This patch forcibly removes the contents of the
query buffer, and a failure could happen because of an incorrect
option given to this meta-command.
Good point, I've removed the change.
This relies on 0001 for the case where \gx fails and you want to make
sure that the follow-up command is able to work in the same pipeline.
I'd suggest to add a \reset after the \gx in this test sequence,
without 0001. At this point we have IMO more advantages in
maintaining the existing properties rather than enforce it, especially
as we have \reset.
Definitely, I've updated the test to use \reset.
Why is stuff specific to psql being mentioned on the libpq page? That
does not seem adapted to me here. Documenting the expectations and
the assumptions one would rely on in the context of a pipeline and
what the backend thinks of it as an internal transaction is a good
idea.
I've realised the behavior was already documented in the protocol-flow
doc[1]https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-PIPELINING with an existing mention on the difference between the first
and following commands of a pipeline.
It is sad to not have at least 2~3 tests that we could use for the
back-branches for the ERROR cases and not the first command cases with
minimal sequences in pgbench? I'm OK with your proposal with psql on
HEAD, but backpatching without a few tests is going to make the
maintenance of stable branches more complicated in the long run as we
would navigate blindly. So I'd try to fix the problems first, then
work on any new proposal once we are in a clean operational state.
And I'd recommend to propose the new feature on a new, separate,
thread to attract a better audience for the reviews this would
require.
I've definitely sidetracked myself with the pipeline support in psql.
I've reworked and reordered the patch set to focus on the issue and
backporting.
0001: Use implicit transaction block for the implicit pipeline
transaction. I've added tests in pgbench that covered the same checks
I did in psql. I've avoided using \syncpipeline since it was
introduced in 17. I've also slightly modified the protocol-flow
pipelining doc to provide more details on the implicit transaction
block and how the first message after a sync can be different. I've
added vacuum as first and second command of a pipeline in the tests
since it was mentioned in the doc.
0002: Pipeline support in psql. I'm still attaching the patch here as
it makes testing pipelining behavior much easier but this is more as a
reference. I will create a dedicated thread and commitfest entry for
it.
[1]: https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-PIPELINING
Attachments:
v05-0002-Add-pipeline-support-in-psql.patchapplication/octet-stream; name=v05-0002-Add-pipeline-support-in-psql.patchDownload+685-5
v05-0001-Consider-pipeline-implicit-transaction-as-a-tran.patchapplication/octet-stream; name=v05-0001-Consider-pipeline-implicit-transaction-as-a-tran.patchDownload+173-24
On Mon, Nov 25, 2024 at 02:35:25PM +0100, Anthonin Bonnefoy wrote:
0001: Use implicit transaction block for the implicit pipeline
transaction. I've added tests in pgbench that covered the same checks
I did in psql. I've avoided using \syncpipeline since it was
introduced in 17. I've also slightly modified the protocol-flow
pipelining doc to provide more details on the implicit transaction
block and how the first message after a sync can be different. I've
added vacuum as first and second command of a pipeline in the tests
since it was mentioned in the doc.
That looks acceptable and backpatchable to me. It is a bit sad that
pgbench does not capture the backend errors so as the output of the
tests is more predictable, but this is not new and I'm OK to expand
that with your proposal based on psql in v5-0002 with these new
meta-commands.
The patch provides positive and negative tests for REINDEX, VACUUM,
LOCK and SET LOCAL, which should be plenty enough. There is also a
negative check for subtransactions, which should never happen in
implicit transaction blocks. This last one is a must-have, thanks.
You are right to not use \syncpipeline for the base fix, still I am
tempted to add one extra test with a sequence like this one for only
one of the commands, just to check that the implicit transaction state
is enforced after a sync. Say this one with a WARNING check in the
output:
\startpipeline
SELECT 1;
\syncpipeline
SET LOCAL foo = bar;
\endpipeline
Tweaks of the tests across multiple stable branches happen all the
time, and adding one specific to 17~ is no big issue. I'm in the
middle of it but I'm lacking the steam to do so today. Will likely
finish tomorrow.
--
Michael
On Tue, Nov 26, 2024 at 04:24:58PM +0900, Michael Paquier wrote:
Tweaks of the tests across multiple stable branches happen all the
time, and adding one specific to 17~ is no big issue. I'm in the
middle of it but I'm lacking the steam to do so today. Will likely
finish tomorrow.
I've edited the whole, added this extra test based on \syncpipeline in
17~, kept the remaining tests in 14~ where pgbench is able to handle
them, and backpatched that down to 13. Let's see now what we can do
with the psql bits.
Anthonin, now that the original problem is solved, could you create a
new thread with your new proposal for psql? That would attract a
better audience for reviews.
--
Michael
On Wed, Nov 27, 2024 at 1:42 AM Michael Paquier <michael@paquier.xyz> wrote:
I've edited the whole, added this extra test based on \syncpipeline in
17~, kept the remaining tests in 14~ where pgbench is able to handle
them, and backpatched that down to 13. Let's see now what we can do
with the psql bits.
Thanks!
Anthonin, now that the original problem is solved, could you create a
new thread with your new proposal for psql? That would attract a
better audience for reviews.
Yes, I will rebase and start the dedicated thread for the pipeline
support in psql.
On Wed, 27 Nov 2024 at 01:42, Michael Paquier <michael@paquier.xyz> wrote:
I've edited the whole, added this extra test based on \syncpipeline in
17~, kept the remaining tests in 14~ where pgbench is able to handle
them, and backpatched that down to 13. Let's see now what we can do
with the psql bits.
FYI: it turns out this change broke one of the tests on our pg_duckdb
repo[1]https://github.com/duckdb/pg_duckdb/actions/runs/12052926038/job/33607381526?pr=453#step:15:51 because the error message that PreventInTranasctionBlock
throws is now different:
E AssertionError: Regex pattern did not match.
E Regex: 'DuckDB queries cannot be executed within a pipeline'
E Input: 'DuckDB queries cannot run inside a transaction block'
I personally don't think that's particularly bad, or revert-worthy,
but the previous error was a bit clearer IMO. I don't see how we can
still show it with the new code though.
[1]: https://github.com/duckdb/pg_duckdb/actions/runs/12052926038/job/33607381526?pr=453#step:15:51
On Tue, Nov 26, 2024 at 7:42 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Nov 26, 2024 at 04:24:58PM +0900, Michael Paquier wrote:
Tweaks of the tests across multiple stable branches happen all the
time, and adding one specific to 17~ is no big issue. I'm in the
middle of it but I'm lacking the steam to do so today. Will likely
finish tomorrow.I've edited the whole, added this extra test based on \syncpipeline in
17~, kept the remaining tests in 14~ where pgbench is able to handle
them, and backpatched that down to 13. Let's see now what we can do
with the psql bits.
I'm very surprised that this was back-patched. I think we should
revert it from the back-branches before it gets into a minor release.
It seems like a clear definitional change, which has no business in a
minor release.
--
Robert Haas
EDB: http://www.enterprisedb.com
On 2024-11-27 15:41:14 -0500, Robert Haas wrote:
On Tue, Nov 26, 2024 at 7:42 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Nov 26, 2024 at 04:24:58PM +0900, Michael Paquier wrote:
Tweaks of the tests across multiple stable branches happen all the
time, and adding one specific to 17~ is no big issue. I'm in the
middle of it but I'm lacking the steam to do so today. Will likely
finish tomorrow.I've edited the whole, added this extra test based on \syncpipeline in
17~, kept the remaining tests in 14~ where pgbench is able to handle
them, and backpatched that down to 13. Let's see now what we can do
with the psql bits.I'm very surprised that this was back-patched. I think we should
revert it from the back-branches before it gets into a minor release.
It seems like a clear definitional change, which has no business in a
minor release.
+1
Robert Haas <robertmhaas@gmail.com> writes:
I'm very surprised that this was back-patched. I think we should
revert it from the back-branches before it gets into a minor release.
It seems like a clear definitional change, which has no business in a
minor release.
I was troubled by that too. Maybe this can be painted as a bug fix
but it seems very questionable --- and even if it is, is it worth
the risk of unexpected side-effects? I'd rather see something that
touches wire-protocol behavior go through a normal beta test cycle.
regards, tom lane
On Wed, Nov 27, 2024 at 03:54:24PM -0500, Tom Lane wrote:
Robert Haas <robertmhaas@gmail.com> writes:
I'm very surprised that this was back-patched. I think we should
revert it from the back-branches before it gets into a minor release.
It seems like a clear definitional change, which has no business in a
minor release.I was troubled by that too. Maybe this can be painted as a bug fix
but it seems very questionable --- and even if it is, is it worth
the risk of unexpected side-effects? I'd rather see something that
touches wire-protocol behavior go through a normal beta test cycle.
Yeah, I was surprised too, even though the author was clear they wanted
to backpatch. I couldn't figure out why it was being backpatched, so I
figured I was missing something.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
When a patient asks the doctor, "Am I going to die?", he means
"Am I going to die soon?"
On Wed, Nov 27, 2024 at 04:20:10PM -0500, Bruce Momjian wrote:
Yeah, I was surprised too, even though the author was clear they wanted
to backpatch. I couldn't figure out why it was being backpatched, so I
figured I was missing something.
The set of inconsistencies of how we decide if one command or the
other should behave differently depending on the use of a pipeline,
on the transaction block state, on the code paths involved and on
MyXactFlags in this case qualified as good enough to me for a
backpatch, in the same lines as 20432f873140 & friends. Also worth
noting is the lack of regression tests back then, some could have been
introduced through pgbench's meta-commands as Anthonin has done here
to provide some combination checks even if the error messages
generated cannot be directly looked at. pgbench was mentioned on the
original thread leading to this commit and the result not include
anything. Anyway, what's done is done..
I don't mind being more careful here based on your concerns, so I'll
go remove that in the stable branches.
--
Michael
On Thu, Nov 28, 2024 at 12:26 AM Michael Paquier <michael@paquier.xyz> wrote:
I don't mind being more careful here based on your concerns, so I'll
go remove that in the stable branches.
Sorry about that. I didn't have a strong need for this to be
backpatched and should have made this clearer.