"cancelling statement due to user request error" occurs but the transaction has committed.

Started by Naoya Anzai almost 12 years ago | 22 messages | hackers
#1Naoya Anzai
anzai-naoya@mxu.nes.nec.co.jp

Hi All,

When log_duration is on (or log_min_duration_statement >= 0),
if a session whose transaction has already been committed internally receives
a SIGINT signal, a query cancellation error is still reported.

For example,
1. A query such as TRUNCATE is removing large table files.
2. The session receives a SIGINT signal.
3. A query cancellation error occurs.
4. But the query has committed.

e.g.)
---
naoya=# \d
        List of relations
 Schema | Name | Type  | Owner
--------+------+-------+-------
 public | hoge | table | naoya
(1 row)

naoya=# set log_duration=on;
SET
naoya=# select count(*) from hoge;
 count
--------
 100000
(1 row)

naoya=# truncate hoge;
Cancel request sent
ERROR: canceling statement due to user request
naoya=# select count(*) from hoge;
 count
-------
     0
(1 row)
---

This happens because errfinish() calls ProcessInterrupts() (during the ereport that logs the query duration).

I think this cancellation request must not interrupt a transaction that has already been committed internally.

Otherwise clients may wrongly conclude that the transaction has been rolled back.

I tried to fix the problem with the following patch:

--- postgresql-fe7337f/src/backend/utils/error/elog.c	2014-06-06 11:57:44.000000000 +0900
+++ postgresql-fe7337f.new/src/backend/utils/error/elog.c	2014-06-06 13:10:51.000000000 +0900
@@ -580,7 +580,8 @@
 	 * can stop a query emitting tons of notice or warning messages, even if
 	 * it's in a loop that otherwise fails to check for interrupts.
 	 */
-	CHECK_FOR_INTERRUPTS();
+	if (IsTransactionState()) 
+		CHECK_FOR_INTERRUPTS();
 }

With this change, when ereport() is called at a non-error level and the
backend is not in a transaction state, errfinish() no longer calls
ProcessInterrupts().

However, I am uneasy about changing errfinish(), because it is called
in a great many situations.

Could you please review it?
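
The sequence described above can be sketched with a small toy model. This is only an illustration under stated assumptions: the struct, the function names, and the `guarded` flag are hypothetical stand-ins for the real backend state, CHECK_FOR_INTERRUPTS(), a non-error ereport(), and the IsTransactionState() test in the patch. It is not PostgreSQL source.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names mimic, but are not, PostgreSQL internals. */
typedef struct Backend {
    bool query_cancel_pending;  /* set when SIGINT is received */
    bool in_transaction;        /* stands in for IsTransactionState() */
    bool committed;             /* the commit has already happened */
    bool error_sent;            /* client was told "canceling statement" */
} Backend;

/* Stand-in for CHECK_FOR_INTERRUPTS(): a pending cancel becomes an error. */
static void check_for_interrupts(Backend *b)
{
    if (b->query_cancel_pending) {
        b->query_cancel_pending = false;
        b->error_sent = true;
    }
}

/* Stand-in for a non-error ereport(); guarded = behaviour with the patch. */
static void log_message(Backend *b, bool guarded)
{
    if (!guarded || b->in_transaction)
        check_for_interrupts(b);
}

/* TRUNCATE commits, SIGINT arrives during cleanup, then duration is logged. */
static Backend run_truncate(bool log_duration, bool guarded)
{
    Backend b = { false, true, false, false };

    b.committed = true;             /* commit is done */
    b.in_transaction = false;       /* no longer in a transaction state */
    b.query_cancel_pending = true;  /* SIGINT lands while files are removed */

    assert(b.committed);            /* duration is logged only after commit */
    if (log_duration)
        log_message(&b, guarded);
    return b;
}
```

In this model, run_truncate(true, false) ends with both committed and error_sent set, which is the reported symptom; with guarded set to true the error is not raised because the model is no longer in a transaction state when the message is logged.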

Regards,

Naoya

---
Naoya Anzai
Engineering Department
NEC Solution Innovators, Ltd.
E-Mail: anzai-naoya@mxu.nes.nec.co.jp
---

Attachments:

postgresql-fe7337f_elog.patch (application/octet-stream, +2 -1)

#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Naoya Anzai (#1)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

On Fri, Jun 6, 2014 at 2:11 PM, Naoya Anzai <anzai-naoya@mxu.nes.nec.co.jp>
wrote:

Hi All,

When log_duration is on (or log_min_duration_statement >= 0),
if a session whose transaction has already been committed internally receives
a SIGINT signal, a query cancellation error is still reported.

For example,
1. A query such as TRUNCATE is removing large table files.
2. The session receives a SIGINT signal.
3. A query cancellation error occurs.
4. But the query has committed.

naoya=# truncate hoge;
Cancel request sent
ERROR: canceling statement due to user request
naoya=# select count(*) from hoge;
count
-------
0
(1 row)
---

This happens because errfinish() calls ProcessInterrupts() (during the
ereport that logs the query duration).

I think this cancellation request must not interrupt a transaction that
has already been committed internally.

Otherwise clients may wrongly conclude that the transaction has been
rolled back.

A similar situation can arise if the server goes down (power outage or
the like) right after committing a transaction: the client sees a broken
connection and can misinterpret that as well.
I think that in such corner cases the client needs to recheck the results
of its action against the database before concluding anything.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#3Naoya Anzai
anzai-naoya@mxu.nes.nec.co.jp
In reply to: Amit Kapila (#2)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

Hi Amit,
Thank you for your response.

A similar situation can arise if the server goes down (power outage or
the like) right after committing a transaction: the client sees a broken
connection and can misinterpret that as well.
I think that in such corner cases the client needs to recheck the results
of its action against the database before concluding anything.

I see.
I now understand that an error raised by ProcessInterrupts (ProcDie, QueryCancel, ClientLost, ...) does not mean "that query has been rolled back".

Regards,

Naoya

---
Naoya Anzai
Engineering Department
NEC Solution Innovators, Ltd.
E-Mail: anzai-naoya@mxu.nes.nec.co.jp
---

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#2)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

On Sun, Jun 8, 2014 at 2:52 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

I think this cancellation request must not interrupt a transaction that
has already been committed internally.

Otherwise clients may wrongly conclude that the transaction has been
rolled back.

A similar situation can arise if the server goes down (power outage or
the like) right after committing a transaction: the client sees a broken
connection and can misinterpret that as well.
I think that in such corner cases the client needs to recheck the results
of its action against the database before concluding anything.

I don't agree with this analysis. If the connection is closed after
the client sends a COMMIT and before it gets a response, then the
client must indeed be smart enough to figure out whether or not the
commit happened. But if the server sends a response, the client
should be able to rely on that response being correct. In this case,
an ERROR is getting sent but the transaction is getting committed;
yuck. I'm not sure whether the fix is right, but this definitely
seems like a bug.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#4)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

Robert Haas <robertmhaas@gmail.com> writes:

I don't agree with this analysis. If the connection is closed after
the client sends a COMMIT and before it gets a response, then the
client must indeed be smart enough to figure out whether or not the
commit happened. But if the server sends a response, the client
should be able to rely on that response being correct. In this case,
an ERROR is getting sent but the transaction is getting committed;
yuck. I'm not sure whether the fix is right, but this definitely
seems like a bug.

In general, the only way to avoid that sort of behavior for a post-commit
error would be to PANIC ... and even then, the transaction got committed,
which might not be the expectation of a client that got an error message,
even if it said PANIC. So this whole area is a minefield, and the only
attractive thing we can do is to try to reduce the number of errors that
can get thrown post-commit. We already, for example, do not treat
post-commit file unlink failures as ERROR, though we surely would prefer
to do that.

So from this standpoint, redefining SIGINT as not throwing an error when
we're in post-commit seems like a good idea. I'm not endorsing any
details of the patch here, but the 20000-foot view seems generally sound.

regards, tom lane

#6Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#4)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

Robert Haas <robertmhaas@gmail.com> wrote:

If the connection is closed after the client sends a COMMIT and
before it gets a response, then the client must indeed be smart
enough to figure out whether or not the commit happened.  But if
the server sends a response, the client should be able to rely on
that response being correct.  In this case, an ERROR is getting
sent but the transaction is getting committed; yuck.  I'm not
sure whether the fix is right, but this definitely seems like a
bug.

+1

It is one thing to send a request and experience a crash or loss of
connection before a response is delivered.  You have to consider
that the state of the transaction is indeterminate and needs to be
checked.  But if the client receives a response saying that the
commit was successful, or that the transaction was rolled back,
that had better reflect reality; otherwise it is a clear bug.
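
The contract described here can be written down as a small check. This is a minimal sketch with hypothetical names (Reply, TxnState, violates_contract); none of them come from libpq or the server, and it assumes the three client-visible outcomes named in this message.

```c
#include <assert.h>

/* Hypothetical client-side model; not part of libpq. */
typedef enum { REPLY_COMMAND_OK, REPLY_ERROR, REPLY_CONNECTION_LOST } Reply;
typedef enum { TXN_COMMITTED, TXN_ROLLED_BACK, TXN_UNKNOWN } TxnState;

/* What the client is entitled to conclude from each kind of reply. */
static TxnState expected_state(Reply r)
{
    assert(r >= REPLY_COMMAND_OK && r <= REPLY_CONNECTION_LOST);
    switch (r) {
    case REPLY_COMMAND_OK:      return TXN_COMMITTED;
    case REPLY_ERROR:           return TXN_ROLLED_BACK;
    case REPLY_CONNECTION_LOST: return TXN_UNKNOWN;   /* must be rechecked */
    }
    return TXN_UNKNOWN;
}

/* Nonzero if the server's reply contradicts what actually happened. */
static int violates_contract(Reply r, TxnState actual)
{
    TxnState expected = expected_state(r);
    return expected != TXN_UNKNOWN && expected != actual;
}
```

Under this model, a lost connection after a commit is not a violation because the client already has to treat the state as indeterminate, whereas an ERROR reply for a committed transaction is.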

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#7Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#5)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

On Tue, Jun 10, 2014 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I don't agree with this analysis. If the connection is closed after
the client sends a COMMIT and before it gets a response, then the
client must indeed be smart enough to figure out whether or not the
commit happened. But if the server sends a response, the client
should be able to rely on that response being correct. In this case,
an ERROR is getting sent but the transaction is getting committed;
yuck. I'm not sure whether the fix is right, but this definitely
seems like a bug.

In general, the only way to avoid that sort of behavior for a post-commit
error would be to PANIC ... and even then, the transaction got committed,
which might not be the expectation of a client that got an error message,
even if it said PANIC. So this whole area is a minefield, and the only
attractive thing we can do is to try to reduce the number of errors that
can get thrown post-commit. We already, for example, do not treat
post-commit file unlink failures as ERROR, though we surely would prefer
to do that.

We could treat it as a lost-communication scenario. The appropriate
recovery actions from the client's point of view are identical.

So from this standpoint, redefining SIGINT as not throwing an error when
we're in post-commit seems like a good idea. I'm not endorsing any
details of the patch here, but the 20000-foot view seems generally sound.

Cool, that makes sense to me also.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#7)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jun 10, 2014 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

... So this whole area is a minefield, and the only
attractive thing we can do is to try to reduce the number of errors that
can get thrown post-commit. We already, for example, do not treat
post-commit file unlink failures as ERROR, though we surely would prefer
to do that.

We could treat it as a lost-communication scenario. The appropriate
recovery actions from the client's point of view are identical.

I'd hardly rate that as an attractive option.

regards, tom lane

#9Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#8)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

On Tue, Jun 10, 2014 at 10:42 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Jun 10, 2014 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

... So this whole area is a minefield, and the only
attractive thing we can do is to try to reduce the number of errors that
can get thrown post-commit. We already, for example, do not treat
post-commit file unlink failures as ERROR, though we surely would prefer
to do that.

We could treat it as a lost-communication scenario. The appropriate
recovery actions from the client's point of view are identical.

I'd hardly rate that as an attractive option.

Well, the only other principled fix I can see is to add a new response
along the lines of ERRORBUTITCOMMITTED, which does not seem attractive
either, since all clients will have to be taught to understand it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#10Naoya Anzai
anzai-naoya@mxu.nes.nec.co.jp
In reply to: Robert Haas (#9)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

Hi,

Well, the only other principled fix I can see is to add a new response
along the lines of ERRORBUTITCOMMITTED, which does not seem attractive
either, since all clients will have to be taught to understand it.

+1

I think the current behavior is hard for many users to understand.
It would be really good if PostgreSQL gave us a message like the warning issued when a wait for synchronous replication is cancelled:
###
WARNING: canceling wait for synchronous replication due to user request
DETAIL: The transaction has already committed locally, but might not have been replicated to the standby.
###

Regards,

Naoya

---
Naoya Anzai
Engineering Department
NEC Solution Innovators, Ltd.
E-Mail: anzai-naoya@mxu.nes.nec.co.jp
---

#11Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#7)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

On Tue, Jun 10, 2014 at 10:30:24AM -0400, Robert Haas wrote:

On Tue, Jun 10, 2014 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I don't agree with this analysis. If the connection is closed after
the client sends a COMMIT and before it gets a response, then the
client must indeed be smart enough to figure out whether or not the
commit happened. But if the server sends a response, the client
should be able to rely on that response being correct. In this case,
an ERROR is getting sent but the transaction is getting committed;
yuck. I'm not sure whether the fix is right, but this definitely
seems like a bug.

In general, the only way to avoid that sort of behavior for a post-commit
error would be to PANIC ... and even then, the transaction got committed,
which might not be the expectation of a client that got an error message,
even if it said PANIC. So this whole area is a minefield, and the only
attractive thing we can do is to try to reduce the number of errors that
can get thrown post-commit. We already, for example, do not treat
post-commit file unlink failures as ERROR, though we surely would prefer
to do that.

We could treat it as a lost-communication scenario. The appropriate
recovery actions from the client's point of view are identical.

So from this standpoint, redefining SIGINT as not throwing an error when
we're in post-commit seems like a good idea. I'm not endorsing any
details of the patch here, but the 20000-foot view seems generally sound.

Cool, that makes sense to me also.

Did we ever do anything about this?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

#12Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#11)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

On Wed, Sep 10, 2014 at 08:10:45PM -0400, Bruce Momjian wrote:

On Tue, Jun 10, 2014 at 10:30:24AM -0400, Robert Haas wrote:

On Tue, Jun 10, 2014 at 10:18 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I don't agree with this analysis. If the connection is closed after
the client sends a COMMIT and before it gets a response, then the
client must indeed be smart enough to figure out whether or not the
commit happened. But if the server sends a response, the client
should be able to rely on that response being correct. In this case,
an ERROR is getting sent but the transaction is getting committed;
yuck. I'm not sure whether the fix is right, but this definitely
seems like a bug.

In general, the only way to avoid that sort of behavior for a post-commit
error would be to PANIC ... and even then, the transaction got committed,
which might not be the expectation of a client that got an error message,
even if it said PANIC. So this whole area is a minefield, and the only
attractive thing we can do is to try to reduce the number of errors that
can get thrown post-commit. We already, for example, do not treat
post-commit file unlink failures as ERROR, though we surely would prefer
to do that.

We could treat it as a lost-communication scenario. The appropriate
recovery actions from the client's point of view are identical.

So from this standpoint, redefining SIGINT as not throwing an error when
we're in post-commit seems like a good idea. I'm not endorsing any
details of the patch here, but the 20000-foot view seems generally sound.

Cool, that makes sense to me also.

Did we ever do anything about this?

I have researched this issue originally reported in June of 2014 and
implemented a patch to ignore cancel while we are completing a commit.
I am not clear if this is the proper place for this code, though a
disable_timeout() call on the line above suggests I am close. :-)
(The disable_timeout disables internal timeouts, but it doesn't disable
cancels coming from the client.)

The first patch is for testing and adds a sleep(5) to the end of the
TRUNCATE command, to give the tester time to press Control-C from psql,
and enables log_duration so the cancel is checked.

The second patch is the patch that disables cancel when we are in the
process of committing; before:

test=> CREATE TABLE test(x INT);
CREATE TABLE
test=> INSERT INTO test VALUES (3);
INSERT 0 1
test=> TRUNCATE test;
^CCancel request sent
--> ERROR: canceling statement due to user request
test=> SELECT * FROM test;
x
---
(0 rows)

and with both patches:

test=> CREATE TABLE test(x INT);
CREATE TABLE
test=> INSERT INTO test VALUES (3);
INSERT 0 1
test=> TRUNCATE test;
^CCancel request sent
--> TRUNCATE TABLE
test=> SELECT * FROM test;
x
---
(0 rows)

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

Attachments:

sleep.diff (text/x-diff; charset=us-ascii, +8 -6)
cancel.diff (text/x-diff; charset=us-ascii, +6 -0)

#13Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#12)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

On Wed, Mar 18, 2015 at 10:56 PM, Bruce Momjian <bruce@momjian.us> wrote:

I have researched this issue originally reported in June of 2014 and
implemented a patch to ignore cancel while we are completing a commit.
I am not clear if this is the proper place for this code, though a
disable_timeout() call on the line above suggests I am close. :-)

This would also disable cancel interrupts while running AFTER
triggers, which seems almost certain to be wrong. TBH, I'm not sure
why the existing HOLD_INTERRUPTS() in CommitTransaction() isn't
already preventing this problem. Did you investigate that at all?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#14Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#13)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

On Thu, Mar 19, 2015 at 07:54:02AM -0400, Robert Haas wrote:

On Wed, Mar 18, 2015 at 10:56 PM, Bruce Momjian <bruce@momjian.us> wrote:

I have researched this issue originally reported in June of 2014 and
implemented a patch to ignore cancel while we are completing a commit.
I am not clear if this is the proper place for this code, though a
disable_timeout() call on the line above suggests I am close. :-)

This would also disable cancel interrupts while running AFTER
triggers, which seems almost certain to be wrong. TBH, I'm not sure
why the existing HOLD_INTERRUPTS() in CommitTransaction() isn't
already preventing this problem. Did you investigate that at all?

Yes, the situation is complex, and the cause was already pointed out by
the original poster.
The issue with CommitTransaction() is that it only _holds_ the signal
--- it doesn't clear it.  Now, since there are very few
CHECK_FOR_INTERRUPTS() calls in the typical commit process flow, the
pending signal is normally discarded without ever being acted on.  However,
if log_duration or log_min_duration_statement is set, the duration logging
calls ereport, which calls errfinish(), which has a call to
CHECK_FOR_INTERRUPTS().

First attached patch is more surgical and clears a possible cancel
request before we report the query duration in the logs --- this doesn't
affect any after triggers that might include CHECK_FOR_INTERRUPTS()
calls we want to honor.

Another approach would be to have CommitTransaction() clear any pending
cancel before it calls RESUME_INTERRUPTS(). The second attached patch
takes that approach, and also works.
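
The difference between holding and clearing a cancel can be illustrated with a miniature of the holdoff mechanism. This is a sketch under assumptions: the variables model InterruptHoldoffCount and QueryCancelPending, the lower-case function names are hypothetical stand-ins for the macros mentioned above, and commit_then_log() is an invented driver, not code from any of the attached patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the holdoff mechanism; not PostgreSQL source. */
static int  holdoff_count = 0;        /* models InterruptHoldoffCount */
static bool cancel_pending = false;   /* models QueryCancelPending */
static int  errors_raised = 0;

static void hold_interrupts(void)   { holdoff_count++; }
static void resume_interrupts(void) { assert(holdoff_count > 0); holdoff_count--; }
static void signal_cancel(void)     { cancel_pending = true; }

/* A pending cancel is acted on only when no hold is in effect. */
static void check_for_interrupts(void)
{
    if (holdoff_count == 0 && cancel_pending) {
        cancel_pending = false;
        errors_raised++;
    }
}

/* Returns how many cancel errors were raised after the commit. */
static int commit_then_log(bool clear_before_resume)
{
    holdoff_count = 0;
    cancel_pending = false;
    errors_raised = 0;

    hold_interrupts();            /* commit critical section begins */
    signal_cancel();              /* SIGINT arrives while held */
    check_for_interrupts();       /* no effect: interrupts are held */
    if (clear_before_resume)
        cancel_pending = false;   /* second approach: forget the cancel */
    resume_interrupts();          /* hold released, cancel may still be pending */
    check_for_interrupts();       /* the check reached via duration logging */
    return errors_raised;
}
```

In this model commit_then_log(false) raises one error after the commit, because the hold only postponed the cancel, while commit_then_log(true) raises none.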

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

Attachments:

cancel2.diff (text/x-diff; charset=us-ascii, +6 -0)
cancel3.diff (text/x-diff; charset=us-ascii, +6 -0)

#15Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#14)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

On Thu, Mar 19, 2015 at 10:23 AM, Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Mar 19, 2015 at 07:54:02AM -0400, Robert Haas wrote:

On Wed, Mar 18, 2015 at 10:56 PM, Bruce Momjian <bruce@momjian.us> wrote:

I have researched this issue originally reported in June of 2014 and
implemented a patch to ignore cancel while we are completing a commit.
I am not clear if this is the proper place for this code, though a
disable_timeout() call on the line above suggests I am close. :-)

This would also disable cancel interrupts while running AFTER
triggers, which seems almost certain to be wrong. TBH, I'm not sure
why the existing HOLD_INTERRUPTS() in CommitTransaction() isn't
already preventing this problem. Did you investigate that at all?

Yes, the situation is complex, and was suggested by the original poster.
The issue with CommitTransaction() is that it only _holds_ the signal
--- it doesn't clear it.  Now, since there are very few
CHECK_FOR_INTERRUPTS() calls in the typical commit process flow, the
signal is normally erased.  However, if log_duration or
log_min_duration_statement are set, they call ereport, which calls
errfinish(), which has a call to CHECK_FOR_INTERRUPTS().

First attached patch is more surgical and clears a possible cancel
request before we report the query duration in the logs --- this doesn't
affect any after triggers that might include CHECK_FOR_INTERRUPTS()
calls we want to honor.

Another approach would be to have CommitTransaction() clear any pending
cancel before it calls RESUME_INTERRUPTS(). The second attached patch
takes that approach, and also works.

So, either way, what happens if the query cancel shows up just an
instant after you clear the flag?
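
The window being asked about can be made concrete with a three-point timeline. This is only an illustrative sketch: the Arrival enum and error_after_commit() are hypothetical, and the model collapses the real code paths into a single flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical timeline model; the step names are illustrative only. */
typedef enum {
    ARRIVES_DURING_COMMIT,   /* while interrupts are held */
    ARRIVES_AFTER_CLEAR,     /* just after the flag was cleared */
    ARRIVES_NEVER
} Arrival;

/* Returns true if the client is sent an ERROR for a committed transaction. */
static bool error_after_commit(Arrival when, bool clear_flag)
{
    bool cancel_pending = false;

    assert(when >= ARRIVES_DURING_COMMIT && when <= ARRIVES_NEVER);

    /* Commit critical section: interrupts are held, nothing is raised. */
    if (when == ARRIVES_DURING_COMMIT)
        cancel_pending = true;

    if (clear_flag)
        cancel_pending = false;   /* the proposed "clear the cancel" step */

    if (when == ARRIVES_AFTER_CLEAR)
        cancel_pending = true;    /* the window in question */

    /* Duration logging: the interrupt check reached through errfinish(). */
    return cancel_pending;
}
```

Clearing the flag removes the error for a cancel that arrived during the commit, but error_after_commit(ARRIVES_AFTER_CLEAR, true) is still true in this model, so the window is narrowed rather than closed.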

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#16Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#15)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

Robert Haas wrote:

On Thu, Mar 19, 2015 at 10:23 AM, Bruce Momjian <bruce@momjian.us> wrote:

The issue with CommitTransaction() is that it only _holds_ the signal
--- it doesn't clear it.  Now, since there are very few
CHECK_FOR_INTERRUPTS() calls in the typical commit process flow, the
signal is normally erased.  However, if log_duration or
log_min_duration_statement are set, they call ereport, which calls
errfinish(), which has a call to CHECK_FOR_INTERRUPTS().

First attached patch is more surgical and clears a possible cancel
request before we report the query duration in the logs --- this doesn't
affect any after triggers that might include CHECK_FOR_INTERRUPTS()
calls we want to honor.

Another approach would be to have CommitTransaction() clear any pending
cancel before it calls RESUME_INTERRUPTS(). The second attached patch
takes that approach, and also works.

So, either way, what happens if the query cancel shows up just an
instant after you clear the flag?

I don't understand why interrupts aren't held until after the commit is
done -- including across the mentioned ereports.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#17Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#15)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

On Thu, Mar 19, 2015 at 04:36:38PM -0400, Robert Haas wrote:

On Thu, Mar 19, 2015 at 10:23 AM, Bruce Momjian <bruce@momjian.us> wrote:

First attached patch is more surgical and clears a possible cancel
request before we report the query duration in the logs --- this doesn't
affect any after triggers that might include CHECK_FOR_INTERRUPTS()
calls we want to honor.

Another approach would be to have CommitTransaction() clear any pending
cancel before it calls RESUME_INTERRUPTS(). The second attached patch
takes that approach, and also works.

So, either way, what happens if the query cancel shows up just an
instant after you clear the flag?

Oh, good point. This version handles that case addressing only the
log_duration* block.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

Attachments:

cancel4.diff (text/x-diff; charset=us-ascii, +34 -35)

#18Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#16)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

On Thu, Mar 19, 2015 at 06:59:20PM -0300, Alvaro Herrera wrote:

Robert Haas wrote:

On Thu, Mar 19, 2015 at 10:23 AM, Bruce Momjian <bruce@momjian.us> wrote:

The issue with CommitTransaction() is that it only _holds_ the signal
--- it doesn't clear it.  Now, since there are very few
CHECK_FOR_INTERRUPTS() calls in the typical commit process flow, the
signal is normally erased.  However, if log_duration or
log_min_duration_statement are set, they call ereport, which calls
errfinish(), which has a call to CHECK_FOR_INTERRUPTS().

First attached patch is more surgical and clears a possible cancel
request before we report the query duration in the logs --- this doesn't
affect any after triggers that might include CHECK_FOR_INTERRUPTS()
calls we want to honor.

Another approach would be to have CommitTransaction() clear any pending
cancel before it calls RESUME_INTERRUPTS(). The second attached patch
takes that approach, and also works.

So, either way, what happens if the query cancel shows up just an
instant after you clear the flag?

I don't understand why interrupts aren't held until after the commit is
done -- including across the mentioned ereports.

Uh, I think Robert was thinking of pre-commit triggers at the top of
CommitTransaction() that might take a long time and we might want to
cancel. In fact, he is right that mid-way into CommitTransaction(),
after those pre-commit triggers, we do HOLD_INTERRUPTS(), then set our
clog bit and continue to the bottom of that function. What is happening
is that we don't _clear_ the cancel bit and log_duration is finding the
cancel.

In an ideal world, we would clear the client cancel in
CommitTransaction() and when we do log_duration*, and the attached patch
now does that.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

Attachments:

cancel5.diff (text/x-diff; charset=us-ascii, +38 -36)

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#17)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

Bruce Momjian <bruce@momjian.us> writes:

On Thu, Mar 19, 2015 at 04:36:38PM -0400, Robert Haas wrote:

So, either way, what happens if the query cancel shows up just an
instant after you clear the flag?

Oh, good point. This version handles that case addressing only the
log_duration* block.

This is just moving the failure cases around, and not by very much either.

The core issue here, I think, is that xact.c only holds off interrupts
during what it considers to be the commit critical section --- which is
okay from the standpoint of transactional consistency. But the complaint
has to do with what the client perceives to have happened if a SIGINT
arrives somewhere between where xact.c has committed and where postgres.c
has reported the commit to the client. If we want to address that, I
think postgres.c needs to hold off interrupts for the entire duration from
just before CommitTransactionCommand() to just after ReadyForQuery().
That's likely to be rather messy, because there are so many code paths
there, especially when you consider error cases.

A possible way to do this without incurring unacceptably high risk of
breakage (in particular, ending up with interrupts still held off when
they shouldn't be any longer) is to assume that there should never be a
case where we reach ReadCommand() with interrupts still held off. Then
we could invent an additional interrupt primitive "RESET_INTERRUPTS()"
that forces InterruptHoldoffCount to zero (and, presumably, then does
a CHECK_FOR_INTERRUPTS()); then putting a HOLD_INTERRUPTS() before calling
CommitTransactionCommand() and a RESET_INTERRUPTS() before waiting for
client input would presumably be pretty safe. On the other hand, that
approach could easily mask interrupt holdoff mismatch bugs elsewhere in
the code base.
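
The proposal can be sketched as follows. Everything here is hypothetical: RESET_INTERRUPTS() is only an idea in this message, the Session struct and run_command() are invented for illustration, and the model assumes that a cancel noticed after the commit has been reported is simply dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the proposal; not PostgreSQL source. */
typedef struct Session {
    int  holdoff_count;     /* models InterruptHoldoffCount */
    bool cancel_pending;    /* models QueryCancelPending */
    bool commit_reported;   /* client saw the command completion */
    bool error_reported;    /* client saw an ERROR instead */
} Session;

static void check_for_interrupts(Session *s)
{
    if (s->holdoff_count == 0 && s->cancel_pending) {
        s->cancel_pending = false;
        if (!s->commit_reported)    /* model assumption: dropped once idle */
            s->error_reported = true;
    }
}

/* The proposed primitive: force the count to zero, then check. */
static void reset_interrupts(Session *s)
{
    s->holdoff_count = 0;
    check_for_interrupts(s);
}

/* One command cycle; extra_holds simulates a holdoff mismatch bug elsewhere. */
static Session run_command(bool hold_until_idle, int extra_holds)
{
    Session s = { 0, false, false, false };

    assert(extra_holds >= 0);
    if (hold_until_idle)
        s.holdoff_count++;            /* before CommitTransactionCommand() */
    s.cancel_pending = true;          /* SIGINT arrives after the commit */
    s.holdoff_count += extra_holds;   /* unbalanced holds from a bug */
    check_for_interrupts(&s);         /* e.g. reached via duration logging */
    if (!s.error_reported)
        s.commit_reported = true;     /* completion and ReadyForQuery sent */
    if (hold_until_idle)
        reset_interrupts(&s);         /* before waiting for client input */
    return s;
}
```

With hold_until_idle set, the client sees the commit reported and no error. The run with extra_holds greater than zero also ends with a holdoff count of zero, which illustrates the stated downside: the forced reset hides a mismatch that would otherwise remain visible.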

regards, tom lane

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#18)
Re: "cancelling statement due to user request error" occurs but the transaction has committed.

Bruce Momjian <bruce@momjian.us> writes:

On Thu, Mar 19, 2015 at 06:59:20PM -0300, Alvaro Herrera wrote:

I don't understand why interrupts aren't held until after the commit is
done -- including across the mentioned ereports.

Uh, I think Robert was thinking of pre-commit triggers at the top of
CommitTransaction() that might take a long time and we might want to
cancel.

Yeah, that's a good point. So really the only way to make this work as
requested is to have some cooperation between xact.c and postgres.c,
so that the hold taken midway through CommitTransaction is kept until
we reach the idle point.

The attached is only very lightly tested but shows what we probably
would need for this. It's a bit messy in that the API for
CommitTransactionCommand leaves it unspecified whether interrupts are
held at exit; I'm not sure if it's useful or feasible to be more precise.

regards, tom lane

Attachments:

prevent-post-commit-interrupts-1.patch (text/x-diff; charset=us-ascii, +155 -135)

#21Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#19)
#22Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#20)