Hot Standy introduced problem with query cancel behavior

Started by Kris Jurkaabout 16 years ago49 messages
#1Kris Jurka
books@ejurka.com
1 attachment(s)

The JDBC driver's regression test suite has revealed a change in behavior
introduced by the hot standy patch. Previously when a client sent a
cancel request on an idle connection, nothing happened. Now it sends an
error message "ERROR: canceling statement due to user request". This
confuses the driver's protocol handling and it thinks that the error
message is for the next statement issued.

Attached is a test case.

Kris Jurka

Attachments:

Cancel.javatext/plain; charset=US-ASCII; name=Cancel.javaDownload
#2Simon Riggs
simon@2ndQuadrant.com
In reply to: Kris Jurka (#1)
Re: Hot Standy introduced problem with query cancel behavior

On Sat, 2009-12-26 at 20:15 -0500, Kris Jurka wrote:

The JDBC driver's regression test suite has revealed a change in behavior
introduced by the hot standy patch. Previously when a client sent a
cancel request on an idle connection, nothing happened. Now it sends an
error message "ERROR: canceling statement due to user request". This
confuses the driver's protocol handling and it thinks that the error
message is for the next statement issued.

Attached is a test case.

Thanks for the report. I'll see about a fix.

--
Simon Riggs www.2ndQuadrant.com

#3Joachim Wieland
joe@mcknight.de
In reply to: Simon Riggs (#2)
Re: Hot Standy introduced problem with query cancel behavior

On Sun, Dec 27, 2009 at 10:42 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Thanks for the report. I'll see about a fix.

In the end we are about to use SIGINT for two use cases:

- cancel an idle transaction
- cancel a running query

Previously a backend that was DoingCommandRead == true didn't do
anything upon reception of SIGINT, now it aborts either the running
query or the idle transaction, which is why Kris's example behaves
differently now.

If we use the same signal for both cases, the receiving backend cannot
tell what the intention of the sending backend was. That's why I
proposed to make SIGINT similar to SIGUSR1 where we write a reason to
a shared memory structure first and then send the signal (see
http://archives.postgresql.org/pgsql-hackers/2009-12/msg02067.php from
a few days ago).

There was also some dicussion about how to communicate the
cancellation back to the client when its idle transaction got aborted.
I implemented what I thought was the conclusion of the discussion but
haven't received a reply on it yet.

Joachim

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joachim Wieland (#3)
Re: Hot Standy introduced problem with query cancel behavior

Joachim Wieland <joe@mcknight.de> writes:

If we use the same signal for both cases, the receiving backend cannot
tell what the intention of the sending backend was. That's why I
proposed to make SIGINT similar to SIGUSR1 where we write a reason to
a shared memory structure first and then send the signal (see
http://archives.postgresql.org/pgsql-hackers/2009-12/msg02067.php from
a few days ago).

This seems like a fairly bad idea. One of the intended use-cases is to
be able to manually "kill -INT" a misbehaving backend. Assuming that
there will be valid info about the signal in shared memory will break
that.

regards, tom lane

#5Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#4)
Re: Hot Standy introduced problem with query cancel behavior

On Tuesday 29 December 2009 16:22:54 Tom Lane wrote:

Joachim Wieland <joe@mcknight.de> writes:

If we use the same signal for both cases, the receiving backend cannot
tell what the intention of the sending backend was. That's why I
proposed to make SIGINT similar to SIGUSR1 where we write a reason to
a shared memory structure first and then send the signal (see
http://archives.postgresql.org/pgsql-hackers/2009-12/msg02067.php from
a few days ago).

This seems like a fairly bad idea. One of the intended use-cases is to
be able to manually "kill -INT" a misbehaving backend. Assuming that
there will be valid info about the signal in shared memory will break
that.

Well. That already is the case now. MyProc->recoveryConflictMode is checked to
recognize what kind of conflict is being resolved...

Andres

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#5)
Re: Hot Standy introduced problem with query cancel behavior

Andres Freund <andres@anarazel.de> writes:

On Tuesday 29 December 2009 16:22:54 Tom Lane wrote:

This seems like a fairly bad idea. One of the intended use-cases is to
be able to manually "kill -INT" a misbehaving backend. Assuming that
there will be valid info about the signal in shared memory will break
that.

Well. That already is the case now. MyProc->recoveryConflictMode is checked to
recognize what kind of conflict is being resolved...

In that case, HS has already broken it, and we need to fix it not make
it worse.

My humble opinion is that SIGINT should not be overloaded with multiple
meanings. We already have a multiplexed signal mechanism, which is what
should be used for any additional signal reasons HS may need to
introduce.

regards, tom lane

#7Simon Riggs
simon@2ndQuadrant.com
In reply to: Joachim Wieland (#3)
Re: Hot Standy introduced problem with query cancel behavior

On Tue, 2009-12-29 at 14:14 +0100, Joachim Wieland wrote:

haven't received a reply on it yet.

Apologies for that. I replied today, just forgot to hit send until now.

Thanks again for the patch.

--
Simon Riggs www.2ndQuadrant.com

#8Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#6)
Re: Hot Standy introduced problem with query cancel behavior

On Tue, 2009-12-29 at 11:13 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On Tuesday 29 December 2009 16:22:54 Tom Lane wrote:

This seems like a fairly bad idea. One of the intended use-cases is to
be able to manually "kill -INT" a misbehaving backend. Assuming that
there will be valid info about the signal in shared memory will break
that.

Well. That already is the case now. MyProc->recoveryConflictMode is checked to
recognize what kind of conflict is being resolved...

In that case, HS has already broken it, and we need to fix it not make
it worse.

My humble opinion is that SIGINT should not be overloaded with multiple
meanings. We already have a multiplexed signal mechanism, which is what
should be used for any additional signal reasons HS may need to
introduce.

It's a revelation to me, but yes, I see it now and agree.

I'm looking at Fujii-san's multiplexing patch from Jul 31 to rewrite
this code using that mechanism. It sounds like it's a neat fit and it
should get around the bug report from Kris also if it all works.

--
Simon Riggs www.2ndQuadrant.com

#9Joachim Wieland
joe@mcknight.de
In reply to: Tom Lane (#4)
Re: Hot Standy introduced problem with query cancel behavior

On Tue, Dec 29, 2009 at 4:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This seems like a fairly bad idea.  One of the intended use-cases is to
be able to manually "kill -INT" a misbehaving backend.  Assuming that
there will be valid info about the signal in shared memory will break
that.

I never intended to change the current behavior.

Actually I wanted to even enhance it by allowing to also cancel an idle
transaction via SIGINT. We are free to define what should happen if there is no
internal reason available because the signal has been sent manually.

We can also use SIGUSR1 of course but then you cannot cancel an idle
transaction just from the commandline. Not sure if this is necessary
though but I would have liked it in the past already (I once used a version of
slony that left transactions idle from time to time...)

Joachim

#10Simon Riggs
simon@2ndQuadrant.com
In reply to: Joachim Wieland (#9)
Re: Hot Standy introduced problem with query cancel behavior

On Wed, 2009-12-30 at 12:05 +0100, Joachim Wieland wrote:

On Tue, Dec 29, 2009 at 4:22 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

This seems like a fairly bad idea. One of the intended use-cases is to
be able to manually "kill -INT" a misbehaving backend. Assuming that
there will be valid info about the signal in shared memory will break
that.

I never intended to change the current behavior.

Actually I wanted to even enhance it by allowing to also cancel an idle
transaction via SIGINT. We are free to define what should happen if there is no
internal reason available because the signal has been sent manually.

We can also use SIGUSR1 of course but then you cannot cancel an idle
transaction just from the commandline. Not sure if this is necessary
though but I would have liked it in the past already (I once used a version of
slony that left transactions idle from time to time...)

Andres mentioned this in relation to Startup process sending signals to
backends. Startup needs to in-some cases issue a FATAL error also, which
is a separate issue from the discusion around SIGINT.

I will rework the FATAL case and continue to support you in finding a
solution to the cancel-while-idle case.

--
Simon Riggs www.2ndQuadrant.com

#11Andres Freund
andres@anarazel.de
In reply to: Simon Riggs (#8)
Re: Hot Standy introduced problem with query cancel behavior

On Wednesday 30 December 2009 01:13:01 Simon Riggs wrote:

On Tue, 2009-12-29 at 11:13 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On Tuesday 29 December 2009 16:22:54 Tom Lane wrote:

This seems like a fairly bad idea. One of the intended use-cases is
to be able to manually "kill -INT" a misbehaving backend. Assuming
that there will be valid info about the signal in shared memory will
break that.

Well. That already is the case now. MyProc->recoveryConflictMode is
checked to recognize what kind of conflict is being resolved...

In that case, HS has already broken it, and we need to fix it not make
it worse.

My humble opinion is that SIGINT should not be overloaded with multiple
meanings. We already have a multiplexed signal mechanism, which is what
should be used for any additional signal reasons HS may need to
introduce.

It's a revelation to me, but yes, I see it now and agree.

I'm looking at Fujii-san's multiplexing patch from Jul 31 to rewrite
this code using that mechanism. It sounds like it's a neat fit and it
should get around the bug report from Kris also if it all works.

Hm. I just read a bit of that multiplexing facility (out of a different reason)
and I have some doubt about it being used unmodified for canceling backends:

procsignal.c:
/*
* Note: Since there's no locking, it's possible that the target
* process detaches from shared memory and exits right after this
* test, before we set the flag and send signal. And the signal slot
* might even be recycled by a new process, so it's remotely possible
* that we set a flag for a wrong process. That's OK, all the signals
* are such that no harm is done if they're mistakenly fired.
*/
procsignal.h:
...
* Also, because of race conditions, it's important that all the signals be
* defined so that no harm is done if a process mistakenly receives one.
*/

When cancelling a backend that behaviour could be a bit annoying ;-)

I guess locking procarray during sending the signal should be enough?

Andres

#12Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#11)
Re: Hot Standy introduced problem with query cancel behavior

On Wed, Dec 30, 2009 at 2:06 PM, Andres Freund <andres@anarazel.de> wrote:

On Wednesday 30 December 2009 01:13:01 Simon Riggs wrote:

On Tue, 2009-12-29 at 11:13 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On Tuesday 29 December 2009 16:22:54 Tom Lane wrote:

This seems like a fairly bad idea.  One of the intended use-cases is
to be able to manually "kill -INT" a misbehaving backend.  Assuming
that there will be valid info about the signal in shared memory will
break that.

Well. That already is the case now. MyProc->recoveryConflictMode is
checked to recognize what kind of conflict is being resolved...

In that case, HS has already broken it, and we need to fix it not make
it worse.

My humble opinion is that SIGINT should not be overloaded with multiple
meanings.  We already have a multiplexed signal mechanism, which is what
should be used for any additional signal reasons HS may need to
introduce.

It's a revelation to me, but yes, I see it now and agree.

I'm looking at Fujii-san's multiplexing patch from Jul 31 to rewrite
this code using that mechanism. It sounds like it's a neat fit and it
should get around the bug report from Kris also if it all works.

Hm. I just read a bit of that multiplexing facility (out of a different reason)
and I have some doubt about it being used unmodified for canceling backends:

procsignal.c:
/*
 * Note: Since there's no locking, it's possible that the target
 * process detaches from shared memory and exits right after this
 * test, before we set the flag and send signal. And the signal slot
 * might even be recycled by a new process, so it's remotely possible
 * that we set a flag for a wrong process. That's OK, all the signals
 * are such that no harm is done if they're mistakenly fired.
 */
procsignal.h:
...
 * Also, because of race conditions, it's important that all the signals be
 * defined so that no harm is done if a process mistakenly receives one.
 */

When cancelling a backend that behaviour could be a bit annoying ;-)

I guess locking procarray during sending the signal should be enough?

I think the idea is that you define the behavior of the signal to be
"look at this other piece of state to see whether you should cancel
yourself" rather than just "cancel yourself". Then if a signal is
delivered by mistake, it's no big deal - you just look at the other
piece of state and decide that you don't need to do anything.

...Robert

#13Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#12)
Re: Hot Standy introduced problem with query cancel behavior

----- Ursprüngliche Mitteilung -----

On Wed, Dec 30, 2009 at 2:06 PM, Andres Freund <andres@anarazel.de> wrote:

On Wednesday 30 December 2009 01:13:01 Simon Riggs wrote:

On Tue, 2009-12-29 at 11:13 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On Tuesday 29 December 2009 16:22:54 Tom Lane wrote:

This seems like a fairly bad idea.  One of the intended use-cases is
to be able to manually "kill -INT" a misbehaving backend.  Assuming
that there will be valid info about the signal in shared memory will
break that.

Well. That already is the case now. MyProc->recoveryConflictMode is
checked to recognize what kind of conflict is being resolved...

In that case, HS has already broken it, and we need to fix it not make
it worse.

My humble opinion is that SIGINT should not be overloaded with multiple
meanings.  We already have a multiplexed signal mechanism, which is what
should be used for any additional signal reasons HS may need to
introduce.

It's a revelation to me, but yes, I see it now and agree.

I'm looking at Fujii-san's multiplexing patch from Jul 31 to rewrite
this code using that mechanism. It sounds like it's a neat fit and it
should get around the bug report from Kris also if it all works.

Hm. I just read a bit of that multiplexing facility (out of a different reason)
and I have some doubt about it being used unmodified for canceling backends:

procsignal.c:
/*
 * Note: Since there's no locking, it's possible that the target
 * process detaches from shared memory and exits right after this
 * test, before we set the flag and send signal. And the signal slot
 * might even be recycled by a new process, so it's remotely possible
 * that we set a flag for a wrong process. That's OK, all the signals
 * are such that no harm is done if they're mistakenly fired.
 */
procsignal.h:
...
 * Also, because of race conditions, it's important that all the signals be
 * defined so that no harm is done if a process mistakenly receives one.
 */

When cancelling a backend that behaviour could be a bit annoying ;-)

I guess locking procarray during sending the signal should be enough?

I think the idea is that you define the behavior of the signal to be
"look at this other piece of state to see whether you should cancel
yourself" rather than just "cancel yourself".  Then if a signal is
delivered by mistake, it's no big deal - you just look at the other
piece of state and decide that you don't need to do anything.

I dont see an easy way to pass enough information right now. You cant regenerate enough of it in the to be killed backend as most of the relevant information is only available in the startup process.
Inventing yet another segment in shm just for this seems overcomplicated to me.
Thats why I suggested locking the procarray for this - without having looked at the code that should prevent a backend slot from beimg reused.

Andres

#14Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#13)
Re: Hot Standy introduced problem with query cancel behavior

On Wed, Dec 30, 2009 at 6:43 PM, Andres Freund <andres@anarazel.de> wrote:

----- Ursprüngliche Mitteilung -----

On Wed, Dec 30, 2009 at 2:06 PM, Andres Freund <andres@anarazel.de> wrote:

On Wednesday 30 December 2009 01:13:01 Simon Riggs wrote:

On Tue, 2009-12-29 at 11:13 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On Tuesday 29 December 2009 16:22:54 Tom Lane wrote:

This seems like a fairly bad idea.  One of the intended use-cases is
to be able to manually "kill -INT" a misbehaving backend.  Assuming
that there will be valid info about the signal in shared memory will
break that.

Well. That already is the case now. MyProc->recoveryConflictMode is
checked to recognize what kind of conflict is being resolved...

In that case, HS has already broken it, and we need to fix it not make
it worse.

My humble opinion is that SIGINT should not be overloaded with multiple
meanings.  We already have a multiplexed signal mechanism, which is what
should be used for any additional signal reasons HS may need to
introduce.

It's a revelation to me, but yes, I see it now and agree.

I'm looking at Fujii-san's multiplexing patch from Jul 31 to rewrite
this code using that mechanism. It sounds like it's a neat fit and it
should get around the bug report from Kris also if it all works.

Hm. I just read a bit of that multiplexing facility (out of a different reason)
and I have some doubt about it being used unmodified for canceling backends:

procsignal.c:
/*
 * Note: Since there's no locking, it's possible that the target
 * process detaches from shared memory and exits right after this
 * test, before we set the flag and send signal. And the signal slot
 * might even be recycled by a new process, so it's remotely possible
 * that we set a flag for a wrong process. That's OK, all the signals
 * are such that no harm is done if they're mistakenly fired.
 */
procsignal.h:
...
 * Also, because of race conditions, it's important that all the signals be
 * defined so that no harm is done if a process mistakenly receives one.
 */

When cancelling a backend that behaviour could be a bit annoying ;-)

I guess locking procarray during sending the signal should be enough?

I think the idea is that you define the behavior of the signal to be
"look at this other piece of state to see whether you should cancel
yourself" rather than just "cancel yourself".  Then if a signal is
delivered by mistake, it's no big deal - you just look at the other
piece of state and decide that you don't need to do anything.

I dont see an easy way to pass enough information right now. You cant regenerate enough of it in the to be killed backend as most of the relevant information is only available in the startup process.
Inventing yet another segment in shm just for this seems overcomplicated to me.
Thats why I suggested locking the procarray for this - without having looked at the code that should prevent a backend slot from beimg reused.

Yeah, I understand, but I have a feeling that the code doesn't do it
that way right now for a reason. Someone who understands this better
than I should comment, but I'm thinking you would likely need to lock
the ProcArray in CheckProcSignal as well, and I'm thinking that can't
be safely done from within a signal handler.

...Robert

#15Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#14)
Re: Hot Standy introduced problem with query cancel behavior

On Thursday 31 December 2009 01:09:57 Robert Haas wrote:

On Wed, Dec 30, 2009 at 6:43 PM, Andres Freund <andres@anarazel.de> wrote:

On Wed, Dec 30, 2009 at 2:06 PM, Andres Freund <andres@anarazel.de>

wrote:

On Wednesday 30 December 2009 01:13:01 Simon Riggs wrote:

On Tue, 2009-12-29 at 11:13 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On Tuesday 29 December 2009 16:22:54 Tom Lane wrote:

This seems like a fairly bad idea. One of the intended
use-cases is to be able to manually "kill -INT" a misbehaving
backend. Assuming that there will be valid info about the
signal in shared memory will break that.

Well. That already is the case now. MyProc->recoveryConflictMode
is checked to recognize what kind of conflict is being
resolved...

In that case, HS has already broken it, and we need to fix it not
make it worse.

My humble opinion is that SIGINT should not be overloaded with
multiple meanings. We already have a multiplexed signal
mechanism, which is what should be used for any additional signal
reasons HS may need to introduce.

It's a revelation to me, but yes, I see it now and agree.

I'm looking at Fujii-san's multiplexing patch from Jul 31 to rewrite
this code using that mechanism. It sounds like it's a neat fit and
it should get around the bug report from Kris also if it all works.

Hm. I just read a bit of that multiplexing facility (out of a
different reason) and I have some doubt about it being used unmodified
for canceling backends:

procsignal.c:
/*
* Note: Since there's no locking, it's possible that the target
* process detaches from shared memory and exits right after this
* test, before we set the flag and send signal. And the signal slot
* might even be recycled by a new process, so it's remotely possible
* that we set a flag for a wrong process. That's OK, all the signals
* are such that no harm is done if they're mistakenly fired.
*/
procsignal.h:
...
* Also, because of race conditions, it's important that all the
signals be * defined so that no harm is done if a process mistakenly
receives one. */

When cancelling a backend that behaviour could be a bit annoying ;-)

I guess locking procarray during sending the signal should be enough?

I think the idea is that you define the behavior of the signal to be
"look at this other piece of state to see whether you should cancel
yourself" rather than just "cancel yourself". Then if a signal is
delivered by mistake, it's no big deal - you just look at the other
piece of state and decide that you don't need to do anything.

I dont see an easy way to pass enough information right now. You cant
regenerate enough of it in the to be killed backend as most of the
relevant information is only available in the startup process. Inventing
yet another segment in shm just for this seems overcomplicated to me.
Thats why I suggested locking the procarray for this - without having
looked at the code that should prevent a backend slot from beimg reused.

Yeah, I understand, but I have a feeling that the code doesn't do it
that way right now for a reason. Someone who understands this better
than I should comment, but I'm thinking you would likely need to lock
the ProcArray in CheckProcSignal as well, and I'm thinking that can't
be safely done from within a signal handler.

I don't think we would need to lock in the signal handler.
The situation is that two different flags (at this point at least) are needed.
FATAL which aborts the session and ERROR which aborts the transaction.

Consider the following scenario:
- both flag are set while holding a lock on the procarray
- starting a new backend requires a lock on the procarray
- backend startup cleans up both flags in its ProcSignalSlot (only that specific
one as its the only one manipulated under a lock, mind you)
- transaction startup clears the ERROR flag under locking
- after the new backend started no signal handler targeted for the old backend
can get triggered (new pid, if we consider direct reuse of the same pid we
have much bigger problems anyway), thus no flag targeted for the old backend
can get set

That would require a nice comment explaining that but it should work.

Another possibility would be to make the whole signal delivery mechanism safe
- that should be possible if we had a atomically settable backend id...

Unfortunately that would limit the max lifetime for a backend a bit - as
sig_atomic_t is 4byte on most platforms. So no backend would be allowed to
live after creation of the 2**32-1th backend after it.
I don't immediately see a way circumvent that 32bit barrier without using
assembler or locks.

Andres

PS: Hm. For my former message beeing written on a mobile phone it looked
surprisingly clean...

#16Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#11)
Re: Hot Standy introduced problem with query cancel behavior

On Wed, 2009-12-30 at 20:06 +0100, Andres Freund wrote:

Hm. I just read a bit of that multiplexing facility (out of a different reason)
and I have some doubt about it being used unmodified for canceling backends:

procsignal.c:
/*
* Note: Since there's no locking, it's possible that the target
* process detaches from shared memory and exits right after this
* test, before we set the flag and send signal. And the signal slot
* might even be recycled by a new process, so it's remotely possible
* that we set a flag for a wrong process. That's OK, all the signals
* are such that no harm is done if they're mistakenly fired.
*/
procsignal.h:
...
* Also, because of race conditions, it's important that all the signals be
* defined so that no harm is done if a process mistakenly receives one.
*/

When cancelling a backend that behaviour could be a bit annoying ;-)

Reading comments alone doesn't show the full situation here.

Before we send signal we check pid also, so the chances of this
happening are fairly small. i.e. we would need to have a backend slot
reused by a new backend with exactly same pid as the last slot holder.

I'm proposing to use this for killing transactions and connections, so I
don't think there's too much harm done there. If the backend is leaving
anyway, that's what we wanted. If its a new guy that is wearing the same
boots then a little collateral damage doesn't leave the server in a bad
place. HS cancellations aren't yet so precise that this matters.

--
Simon Riggs www.2ndQuadrant.com

#17Andres Freund
andres@anarazel.de
In reply to: Simon Riggs (#16)
Re: Hot Standy introduced problem with query cancel behavior

On Thursday 31 December 2009 12:25:19 Simon Riggs wrote:

On Wed, 2009-12-30 at 20:06 +0100, Andres Freund wrote:

Hm. I just read a bit of that multiplexing facility (out of a different
reason) and I have some doubt about it being used unmodified for
canceling backends:

procsignal.c:
/*
* Note: Since there's no locking, it's possible that the target
* process detaches from shared memory and exits right after this
* test, before we set the flag and send signal. And the signal slot
* might even be recycled by a new process, so it's remotely possible
* that we set a flag for a wrong process. That's OK, all the signals
* are such that no harm is done if they're mistakenly fired.
*/
procsignal.h:
...
* Also, because of race conditions, it's important that all the signals
be * defined so that no harm is done if a process mistakenly receives
one. */

When cancelling a backend that behaviour could be a bit annoying ;-)

Reading comments alone doesn't show the full situation here.

Before we send signal we check pid also, so the chances of this
happening are fairly small. i.e. we would need to have a backend slot
reused by a new backend with exactly same pid as the last slot holder.

Well. The problem does not occur for critical errors (i.e. session death)
only. As signal delivery is asynchronous it can very well happen for
transaction cancellation as well.

I'm proposing to use this for killing transactions and connections, so I
don't think there's too much harm done there. If the backend is leaving
anyway, that's what we wanted. If its a new guy that is wearing the same
boots then a little collateral damage doesn't leave the server in a bad
place. HS cancellations aren't yet so precise that this matters.

Building racy infrastructure when it can be avoided with a little care still
seems not to be the best path to me.

Andres

#18Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#17)
1 attachment(s)
Re: Hot Standy introduced problem with query cancel behavior

On Thu, 2009-12-31 at 13:14 +0100, Andres Freund wrote:

When cancelling a backend that behaviour could be a bit

annoying ;-)

Reading comments alone doesn't show the full situation here.

Before we send signal we check pid also, so the chances of this
happening are fairly small. i.e. we would need to have a backend

slot

reused by a new backend with exactly same pid as the last slot

holder.
Well. The problem does not occur for critical errors (i.e. session
death)
only. As signal delivery is asynchronous it can very well happen for
transaction cancellation as well.

I'm proposing to use this for killing transactions and connections,

so I

don't think there's too much harm done there. If the backend is

leaving

anyway, that's what we wanted. If its a new guy that is wearing the

same

boots then a little collateral damage doesn't leave the server in a

bad

place. HS cancellations aren't yet so precise that this matters.

Building racy infrastructure when it can be avoided with a little care
still seems not to be the best path to me.

Doing that will add more complexity in an area that is hard to test
effectively. I think the risk of introducing further bugs while trying
to fix this rare condition is high. Right now the conflict processing
needs more work and is often much less precise than this, so improving
this aspect of it would not be a priority. I've added it to the TODO
though. Thank you for your research.

I enclose the patch I am currently testing, as a patch-on-patch on top
of Joachim's changes, recently published on this thread. POC only as
yet.

Patch implements recovery conflict signalling using SIGUSR1
multiplexing, then uses a SessionCancelPending mode similar to Joachim's
TransactionCancelPending.

--
Simon Riggs www.2ndQuadrant.com

Attachments:

conflict_using_sigusr1.v1.patchtext/x-patch; charset=UTF-8; name=conflict_using_sigusr1.v1.patchDownload
*** a/src/backend/storage/ipc/procarray.c
--- b/src/backend/storage/ipc/procarray.c
***************
*** 1704,1710 **** GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid,
   * Returns pid of the process signaled, or 0 if not found.
   */
  pid_t
! CancelVirtualTransaction(VirtualTransactionId vxid, int cancel_mode)
  {
  	ProcArrayStruct *arrayP = procArray;
  	int			index;
--- 1704,1710 ----
   * Returns pid of the process signaled, or 0 if not found.
   */
  pid_t
! CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
  {
  	ProcArrayStruct *arrayP = procArray;
  	int			index;
***************
*** 1722,1733 **** CancelVirtualTransaction(VirtualTransactionId vxid, int cancel_mode)
  		if (procvxid.backendId == vxid.backendId &&
  			procvxid.localTransactionId == vxid.localTransactionId)
  		{
- 			/*
- 			 * Issue orders for the proc to read next time it receives SIGINT
- 			 */
- 			if (proc->recoveryConflictMode < cancel_mode)
- 				proc->recoveryConflictMode = cancel_mode;
- 
  			pid = proc->pid;
  			break;
  		}
--- 1722,1727 ----
***************
*** 1741,1747 **** CancelVirtualTransaction(VirtualTransactionId vxid, int cancel_mode)
  		 * Kill the pid if it's still here. If not, that's what we wanted
  		 * so ignore any errors.
  		 */
! 		kill(pid, SIGINT);
  	}
  
  	return pid;
--- 1735,1741 ----
  		 * Kill the pid if it's still here. If not, that's what we wanted
  		 * so ignore any errors.
  		 */
! 		(void) SendProcSignal(pid, sigmode, vxid.backendId);
  	}
  
  	return pid;
*** a/src/backend/storage/ipc/procsignal.c
--- b/src/backend/storage/ipc/procsignal.c
***************
*** 24,29 ****
--- 24,30 ----
  #include "storage/procsignal.h"
  #include "storage/shmem.h"
  #include "storage/sinval.h"
+ #include "storage/standby.h"
  
  
  /*
***************
*** 258,262 **** procsignal_sigusr1_handler(SIGNAL_ARGS)
--- 259,269 ----
  	if (CheckProcSignal(PROCSIG_NOTIFY_INTERRUPT))
  		HandleNotifyInterrupt();
  
+ 	if (CheckProcSignal(PROCSIG_CONFLICT_ERROR_INTERRUPT))
+ 		HandleConflictInterrupt(ERROR);
+ 
+ 	if (CheckProcSignal(PROCSIG_CONFLICT_FATAL_INTERRUPT))
+ 		HandleConflictInterrupt(FATAL);
+ 
  	errno = save_errno;
  }
*** a/src/backend/storage/ipc/standby.c
--- b/src/backend/storage/ipc/standby.c
***************
*** 218,229 **** ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
  			if (WaitExceedsMaxStandbyDelay())
  			{
  				pid_t pid;
  
  				/*
  				 * Now find out who to throw out of the balloon.
  				 */
  				Assert(VirtualTransactionIdIsValid(*waitlist));
! 				pid = CancelVirtualTransaction(*waitlist, cancel_mode);
  
  				if (pid != 0)
  				{
--- 218,235 ----
  			if (WaitExceedsMaxStandbyDelay())
  			{
  				pid_t pid;
+ 				ProcSignalReason	sigmode;
+ 
+ 				if (cancel_mode == CONFLICT_MODE_ERROR)
+ 					sigmode = PROCSIG_CONFLICT_ERROR_INTERRUPT;
+ 				else
+ 					sigmode = PROCSIG_CONFLICT_FATAL_INTERRUPT;
  
  				/*
  				 * Now find out who to throw out of the balloon.
  				 */
  				Assert(VirtualTransactionIdIsValid(*waitlist));
! 				pid = CancelVirtualTransaction(*waitlist, sigmode);
  
  				if (pid != 0)
  				{
***************
*** 272,277 **** ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
--- 278,305 ----
      }
  }
  
+ void
+ HandleConflictInterrupt(int conflict_mode)
+ {
+ 	switch (conflict_mode)
+ 	{
+ 		case FATAL:
+ 			SessionCancelPending = true;
+ 			RecoveryConflictPending = true;
+ 			ProcessInterrupts();				/* Should not return */
+ 			return;
+ 
+ 		case ERROR:
+ 			TransactionCancelPending = true;
+ 			RecoveryConflictPending = true;
+ 			ProcessInterrupts();
+ 			return;
+ 
+ 		default:
+ 			elog(ERROR, "Unknown conflict mode");
+ 	}
+ }
+ 
  /*
   * -----------------------------------------------------
   * Locking in Recovery Mode
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
***************
*** 2737,2795 **** ProcessInterrupts(void)
  			ereport(ERROR,
  					(errcode(ERRCODE_QUERY_CANCELED),
  					 errmsg("canceling autovacuum task")));
! 		else
! 		{
! 			int cancelMode = MyProc->recoveryConflictMode;
! 
! 			/*
! 			 * XXXHS: We don't yet have a clean way to cancel an
! 			 * idle-in-transaction session, so make it FATAL instead.
! 			 * This isn't as bad as it looks because we don't issue a
! 			 * CONFLICT_MODE_ERROR for a session with proc->xmin == 0
! 			 * on cleanup conflicts. There's a possibility that we
! 			 * marked somebody as a conflict and then they go idle.
! 			 */
! 			if (DoingCommandRead && IsTransactionBlock() &&
! 				cancelMode == CONFLICT_MODE_ERROR)
! 			{
! 				cancelMode = CONFLICT_MODE_FATAL;
! 			}
! 
! 			switch (cancelMode)
! 			{
! 				case CONFLICT_MODE_FATAL:
! 						Assert(RecoveryInProgress());
! 						ereport(FATAL,
! 							(errcode(ERRCODE_QUERY_CANCELED),
! 							 errmsg("canceling session due to conflict with recovery")));
! 
! 				case CONFLICT_MODE_ERROR:
! 						/*
! 						 * We are aborting because we need to release
! 						 * locks. So we need to abort out of all
! 						 * subtransactions to make sure we release
! 						 * all locks at whatever their level.
! 						 *
! 						 * XXX Should we try to examine the
! 						 * transaction tree and cancel just enough
! 						 * subxacts to remove locks? Doubt it.
! 						 */
! 						Assert(RecoveryInProgress());
! 						AbortOutOfAnyTransaction();
! 						ereport(ERROR,
! 							(errcode(ERRCODE_QUERY_CANCELED),
! 							 errmsg("canceling statement due to conflict with recovery")));
! 						return;
! 
! 				default:
! 						/* No conflict pending, so fall through */
! 						break;
! 			}
! 
  			ereport(ERROR,
  					(errcode(ERRCODE_QUERY_CANCELED),
  					 errmsg("canceling statement due to user request")));
- 		}
  	}
  	if (TransactionCancelPending)
  	{
--- 2737,2746 ----
  			ereport(ERROR,
  					(errcode(ERRCODE_QUERY_CANCELED),
  					 errmsg("canceling autovacuum task")));
! 		else 
  			ereport(ERROR,
  					(errcode(ERRCODE_QUERY_CANCELED),
  					 errmsg("canceling statement due to user request")));
  	}
  	if (TransactionCancelPending)
  	{
***************
*** 2802,2810 **** ProcessInterrupts(void)
  		DisableNotifyInterrupt();
  		DisableCatchupInterrupt();
  
! 		ereport(NOTICE,
! 				(errcode(ERRCODE_QUERY_CANCELED),
! 				 errmsg("canceling idle transaction due to administrator request")));
  
  		AbortTransactionAndAnySubtransactions();
  
--- 2753,2766 ----
  		DisableNotifyInterrupt();
  		DisableCatchupInterrupt();
  
! 		if (RecoveryConflictPending)
! 			ereport(NOTICE,
! 					(errcode(ERRCODE_QUERY_CANCELED),
! 					 errmsg("canceling transaction due to conflict with recovery")));
! 		else
! 			ereport(NOTICE,
! 					(errcode(ERRCODE_QUERY_CANCELED),
! 					 errmsg("canceling idle transaction due to administrator request")));
  
  		AbortTransactionAndAnySubtransactions();
  
***************
*** 2815,2820 **** ProcessInterrupts(void)
--- 2771,2790 ----
  		set_ps_display("idle in transaction (aborted)", false);
  		pgstat_report_activity("<IDLE> in transaction (aborted)");
  	}
+ 	if (SessionCancelPending)
+ 	{
+ 		SessionCancelPending = false;
+ 		ImmediateInterruptOK = false;	/* not idle anymore */
+ 
+ 		if (RecoveryConflictPending)
+ 			ereport(FATAL,
+ 					(errcode(ERRCODE_QUERY_CANCELED),
+ 					 errmsg("canceling session due to conflict with recovery")));
+ 		else
+ 			ereport(FATAL,
+ 					(errcode(ERRCODE_QUERY_CANCELED),
+ 					 errmsg("canceling session due to administrator request")));
+ 	}
  	/* If we get here, do nothing (probably, QueryCancelPending was reset) */
  }
  
*** a/src/backend/utils/init/globals.c
--- b/src/backend/utils/init/globals.c
***************
*** 28,33 **** ProtocolVersion FrontendProtocol;
--- 28,35 ----
  volatile bool InterruptPending = false;
  volatile bool QueryCancelPending = false;
  volatile bool TransactionCancelPending = false;
+ volatile bool SessionCancelPending = false;
+ volatile bool RecoveryConflictPending = false;
  volatile bool ProcDiePending = false;
  volatile bool ImmediateInterruptOK = false;
  volatile uint32 InterruptHoldoffCount = 0;
*** a/src/include/miscadmin.h
--- b/src/include/miscadmin.h
***************
*** 69,74 ****
--- 69,76 ----
  extern PGDLLIMPORT volatile bool InterruptPending;
  extern volatile bool QueryCancelPending;
  extern volatile bool TransactionCancelPending;
+ extern volatile bool SessionCancelPending;
+ extern volatile bool RecoveryConflictPending;
  extern volatile bool ProcDiePending;
  
  /* these are marked volatile because they are examined by signal handlers: */
*** a/src/include/storage/procarray.h
--- b/src/include/storage/procarray.h
***************
*** 15,20 ****
--- 15,21 ----
  #define PROCARRAY_H
  
  #include "storage/lock.h"
+ #include "storage/procsignal.h"
  #include "storage/standby.h"
  #include "utils/snapshot.h"
  
***************
*** 58,65 **** extern VirtualTransactionId *GetCurrentVirtualXIDs(TransactionId limitXmin,
  					  int *nvxids);
  extern VirtualTransactionId *GetConflictingVirtualXIDs(TransactionId limitXmin,
  					Oid dbOid, bool skipExistingConflicts);
! extern pid_t CancelVirtualTransaction(VirtualTransactionId vxid,
! 						 int cancel_mode);
  
  extern int	CountActiveBackends(void);
  extern int	CountDBBackends(Oid databaseid);
--- 59,65 ----
  					  int *nvxids);
  extern VirtualTransactionId *GetConflictingVirtualXIDs(TransactionId limitXmin,
  					Oid dbOid, bool skipExistingConflicts);
! extern pid_t CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode);
  
  extern int	CountActiveBackends(void);
  extern int	CountDBBackends(Oid databaseid);
*** a/src/include/storage/procsignal.h
--- b/src/include/storage/procsignal.h
***************
*** 31,36 **** typedef enum
--- 31,38 ----
  {
  	PROCSIG_CATCHUP_INTERRUPT,	/* sinval catchup interrupt */
  	PROCSIG_NOTIFY_INTERRUPT,	/* listen/notify interrupt */
+ 	PROCSIG_CONFLICT_ERROR_INTERRUPT,	/* recovery conflict error */
+ 	PROCSIG_CONFLICT_FATAL_INTERRUPT,	/* recovery conflict fatal */
  
  	NUM_PROCSIGNALS				/* Must be last! */
  } ProcSignalReason;
*** a/src/include/storage/standby.h
--- b/src/include/storage/standby.h
***************
*** 26,31 **** extern int	vacuum_defer_cleanup_age;
--- 26,32 ----
  
  extern void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
  									   char *reason, int cancel_mode);
+ extern void HandleConflictInterrupt(int conflict_mode);
  
  extern void InitRecoveryTransactionEnvironment(void);
  extern void ShutdownRecoveryTransactionEnvironment(void);
#19Joachim Wieland
joe@mcknight.de
In reply to: Simon Riggs (#18)
1 attachment(s)
Re: Hot Standy introduced problem with query cancel behavior

On Thu, Dec 31, 2009 at 6:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Building racy infrastructure when it can be avoided with a little care
still seems not to be the best path to me.

Doing that will add more complexity in an area that is hard to test
effectively. I think the risk of introducing further bugs while trying
to fix this rare condition is high. Right now the conflict processing
needs more work and is often much less precise than this, so improving
this aspect of it would not be a priority. I've added it to the TODO
though. Thank you for your research.

Patch implements recovery conflict signalling using SIGUSR1
multiplexing, then uses a SessionCancelPending mode similar to Joachim's
TransactionCancelPending.

I have reworked Simon's patch a bit and attach the result.

Quick facts:

- Hot Standby only uses SIGUSR1
- SIGINT behaves as it did before: it only cancels running statements
- pg_cancel_backend() continues to use SIGINT
- I added pg_cancel_idle_transaction() to cancel an idle transaction via
SIGUSR1
- One central function HandleCancelAction() sets the flags before calling
ProcessInterrupts(), it is called from the different signal handlers and
receives parameters about what it should do
- If a SIGUSR1 reason is used that will cancel something, ProcArrayLock is
acquired until the signal has been sent to make sure that we won't signal the
wrong backend. Does this sufficiently cover the concerns of Andres Freund
discussed upthread?

As there were so many boolean SomethingCancelPending variables I changed them
to be bitmasks and merged all of them into a single variable. However I am not
sure if we can change the name and type of especially InterruptPending that
easily as it was marked with PGDLLIMPORT...

@Simon: Is there a reason why you have not yet removed recoveryConflictMode
from PGPROC?

Joachim

Attachments:

hs_signaling.1.difftext/x-diff; charset=US-ASCII; name=hs_signaling.1.diffDownload
diff -cr cvs/src/backend/access/transam/xact.c cvs.build/src/backend/access/transam/xact.c
*** cvs/src/backend/access/transam/xact.c	2009-12-24 13:55:12.000000000 +0100
--- cvs.build/src/backend/access/transam/xact.c	2010-01-05 11:25:17.000000000 +0100
***************
*** 313,320 ****
  /*
   *	IsAbortedTransactionBlockState
   *
!  *	This returns true if we are currently running a query
!  *	within an aborted transaction block.
   */
  bool
  IsAbortedTransactionBlockState(void)
--- 313,319 ----
  /*
   *	IsAbortedTransactionBlockState
   *
!  *	This returns true if we are within an aborted transaction block.
   */
  bool
  IsAbortedTransactionBlockState(void)
***************
*** 2692,2697 ****
--- 2691,2738 ----
  }
  
  /*
+  *	AbortTransactionAndAnySubtransactions
+  *
+  * Similar to AbortCurrentTransaction but if any subtransactions
+  * in progress we abort them and all of their parents. So this is
+  * used when the caller wishes to make the abort untrappable by the user.
+  */
+ void
+ AbortTransactionAndAnySubtransactions(void)
+ {
+ 	TransactionState s = CurrentTransactionState;
+ 
+ 	switch (s->blockState)
+ 	{
+ 		case TBLOCK_DEFAULT:
+ 		case TBLOCK_STARTED:
+ 		case TBLOCK_BEGIN:
+ 		case TBLOCK_INPROGRESS:
+ 		case TBLOCK_END:
+ 		case TBLOCK_ABORT:
+ 		case TBLOCK_SUBABORT:
+ 		case TBLOCK_ABORT_END:
+ 		case TBLOCK_ABORT_PENDING:
+ 		case TBLOCK_PREPARE:
+ 		case TBLOCK_SUBABORT_END:
+ 		case TBLOCK_SUBABORT_RESTART:
+ 			AbortCurrentTransaction();
+ 			break;
+ 
+ 		case TBLOCK_SUBINPROGRESS:
+ 		case TBLOCK_SUBBEGIN:
+ 		case TBLOCK_SUBEND:
+ 		case TBLOCK_SUBABORT_PENDING:
+ 		case TBLOCK_SUBRESTART:
+ 			AbortSubTransaction();
+ 			CleanupSubTransaction();
+ 			AbortTransactionAndAnySubtransactions();
+ 			break;
+ 	}
+ }
+ 
+ 
+ /*
   *	PreventTransactionChain
   *
   *	This routine is to be called by statements that must not run inside
diff -cr cvs/src/backend/commands/vacuum.c cvs.build/src/backend/commands/vacuum.c
*** cvs/src/backend/commands/vacuum.c	2009-12-24 13:55:13.000000000 +0100
--- cvs.build/src/backend/commands/vacuum.c	2010-01-03 20:24:26.000000000 +0100
***************
*** 3936,3942 ****
  	CHECK_FOR_INTERRUPTS();
  
  	/* Nap if appropriate */
! 	if (VacuumCostActive && !InterruptPending &&
  		VacuumCostBalance >= VacuumCostLimit)
  	{
  		int			msec;
--- 3936,3942 ----
  	CHECK_FOR_INTERRUPTS();
  
  	/* Nap if appropriate */
! 	if (VacuumCostActive && !IsInterruptEventPending &&
  		VacuumCostBalance >= VacuumCostLimit)
  	{
  		int			msec;
diff -cr cvs/src/backend/postmaster/autovacuum.c cvs.build/src/backend/postmaster/autovacuum.c
*** cvs/src/backend/postmaster/autovacuum.c	2009-12-09 11:24:41.000000000 +0100
--- cvs.build/src/backend/postmaster/autovacuum.c	2010-01-07 00:22:21.000000000 +0100
***************
*** 479,488 ****
  		/* Prevents interrupts while cleaning up */
  		HOLD_INTERRUPTS();
  
! 		/* Forget any pending QueryCancel request */
! 		QueryCancelPending = false;
  		disable_sig_alarm(true);
! 		QueryCancelPending = false;		/* again in case timeout occurred */
  
  		/* Report the error to the server log */
  		EmitErrorReport();
--- 479,489 ----
  		/* Prevents interrupts while cleaning up */
  		HOLD_INTERRUPTS();
  
! 		/* Forget any pending CancelQuery request */
! 		ClearInterruptEventPending(CancelQueryPending);
  		disable_sig_alarm(true);
! 		/* again in case timeout occurred */
! 		ClearInterruptEventPending(CancelQueryPending);
  
  		/* Report the error to the server log */
  		EmitErrorReport();
***************
*** 2233,2244 ****
  			autovacuum_do_vac_analyze(tab, bstrategy);
  
  			/*
! 			 * Clear a possible query-cancel signal, to avoid a late reaction
  			 * to an automatically-sent signal because of vacuuming the
  			 * current table (we're done with it, so it would make no sense to
  			 * cancel at this point.)
  			 */
! 			QueryCancelPending = false;
  		}
  		PG_CATCH();
  		{
--- 2234,2245 ----
  			autovacuum_do_vac_analyze(tab, bstrategy);
  
  			/*
! 			 * Clear a possible cancel-query signal, to avoid a late reaction
  			 * to an automatically-sent signal because of vacuuming the
  			 * current table (we're done with it, so it would make no sense to
  			 * cancel at this point.)
  			 */
! 			ClearInterruptEventPending(CancelQueryPending);
  		}
  		PG_CATCH();
  		{
diff -cr cvs/src/backend/storage/ipc/ipc.c cvs.build/src/backend/storage/ipc/ipc.c
*** cvs/src/backend/storage/ipc/ipc.c	2009-12-09 11:24:52.000000000 +0100
--- cvs.build/src/backend/storage/ipc/ipc.c	2010-01-03 20:33:49.000000000 +0100
***************
*** 155,163 ****
  	 * close up shop already.  Note that the signal handlers will not set
  	 * these flags again, now that proc_exit_inprogress is set.
  	 */
! 	InterruptPending = false;
! 	ProcDiePending = false;
! 	QueryCancelPending = false;
  	/* And let's just make *sure* we're not interrupted ... */
  	ImmediateInterruptOK = false;
  	InterruptHoldoffCount = 1;
--- 155,161 ----
  	 * close up shop already.  Note that the signal handlers will not set
  	 * these flags again, now that proc_exit_inprogress is set.
  	 */
! 	ZeroInterruptEventPending();
  	/* And let's just make *sure* we're not interrupted ... */
  	ImmediateInterruptOK = false;
  	InterruptHoldoffCount = 1;
diff -cr cvs/src/backend/storage/ipc/procarray.c cvs.build/src/backend/storage/ipc/procarray.c
*** cvs/src/backend/storage/ipc/procarray.c	2009-12-24 13:55:18.000000000 +0100
--- cvs.build/src/backend/storage/ipc/procarray.c	2010-01-06 20:02:31.000000000 +0100
***************
*** 52,57 ****
--- 52,58 ----
  #include "access/twophase.h"
  #include "miscadmin.h"
  #include "storage/procarray.h"
+ #include "storage/procsignal.h"
  #include "storage/standby.h"
  #include "utils/builtins.h"
  #include "utils/snapmgr.h"
***************
*** 1704,1710 ****
   * Returns pid of the process signaled, or 0 if not found.
   */
  pid_t
! CancelVirtualTransaction(VirtualTransactionId vxid, int cancel_mode)
  {
  	ProcArrayStruct *arrayP = procArray;
  	int			index;
--- 1705,1711 ----
   * Returns pid of the process signaled, or 0 if not found.
   */
  pid_t
! CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
  {
  	ProcArrayStruct *arrayP = procArray;
  	int			index;
***************
*** 1722,1733 ****
  		if (procvxid.backendId == vxid.backendId &&
  			procvxid.localTransactionId == vxid.localTransactionId)
  		{
- 			/*
- 			 * Issue orders for the proc to read next time it receives SIGINT
- 			 */
- 			if (proc->recoveryConflictMode < cancel_mode)
- 				proc->recoveryConflictMode = cancel_mode;
- 
  			pid = proc->pid;
  			break;
  		}
--- 1723,1728 ----
***************
*** 1741,1747 ****
  		 * Kill the pid if it's still here. If not, that's what we wanted
  		 * so ignore any errors.
  		 */
! 		kill(pid, SIGINT);
  	}
  
  	return pid;
--- 1736,1742 ----
  		 * Kill the pid if it's still here. If not, that's what we wanted
  		 * so ignore any errors.
  		 */
! 		(void) SendProcSignal(pid, sigmode, vxid.backendId);
  	}
  
  	return pid;
diff -cr cvs/src/backend/storage/ipc/procsignal.c cvs.build/src/backend/storage/ipc/procsignal.c
*** cvs/src/backend/storage/ipc/procsignal.c	2009-12-09 11:24:52.000000000 +0100
--- cvs.build/src/backend/storage/ipc/procsignal.c	2010-01-07 12:00:33.000000000 +0100
***************
*** 24,29 ****
--- 24,31 ----
  #include "storage/procsignal.h"
  #include "storage/shmem.h"
  #include "storage/sinval.h"
+ #include "storage/standby.h"
+ #include "tcop/tcopprot.h"
  
  
  /*
***************
*** 170,195 ****
  SendProcSignal(pid_t pid, ProcSignalReason reason, BackendId backendId)
  {
  	volatile ProcSignalSlot *slot;
  
  	if (backendId != InvalidBackendId)
  	{
! 		slot = &ProcSignalSlots[backendId - 1];
! 
  		/*
! 		 * Note: Since there's no locking, it's possible that the target
  		 * process detaches from shared memory and exits right after this
! 		 * test, before we set the flag and send signal. And the signal slot
! 		 * might even be recycled by a new process, so it's remotely possible
! 		 * that we set a flag for a wrong process. That's OK, all the signals
! 		 * are such that no harm is done if they're mistakenly fired.
  		 */
  		if (slot->pss_pid == pid)
  		{
  			/* Atomically set the proper flag */
  			slot->pss_signalFlags[reason] = true;
  			/* Send signal */
! 			return kill(pid, SIGUSR1);
  		}
  	}
  	else
  	{
--- 172,210 ----
  SendProcSignal(pid_t pid, ProcSignalReason reason, BackendId backendId)
  {
  	volatile ProcSignalSlot *slot;
+ 	int						 ret;
  
  	if (backendId != InvalidBackendId)
  	{
! 		Assert(backendId <= MaxBackends);
  		/*
! 		 * If we did not get a lock, it would be possible that the target
  		 * process detaches from shared memory and exits right after this
! 		 * test, before we set the flag and send the signal. And the signal
! 		 * slot might even be recycled by a new process, so it's remotely
! 		 * possible that we set a flag for a wrong process. That's OK for
! 		 * harmless signal reasons. As soon as we get a cancel-something signal
! 		 * reason (which really should not happen all that often), we take the
! 		 * time to grab a shared lock on ProcArray before we start to search
! 		 * that backend and hold it until we have sent the signal.
  		 */
+ 		if (!IsHarmlessSignal(reason))
+ 			LWLockAcquire(ProcArrayLock, LW_SHARED);
+ 
+ 		slot = &ProcSignalSlots[backendId - 1];
+ 
  		if (slot->pss_pid == pid)
  		{
  			/* Atomically set the proper flag */
  			slot->pss_signalFlags[reason] = true;
  			/* Send signal */
! 			ret = kill(pid, SIGUSR1);
  		}
+ 
+ 		if (!IsHarmlessSignal(reason))
+ 			LWLockRelease(ProcArrayLock);
+ 
+ 		return ret;
  	}
  	else
  	{
***************
*** 200,205 ****
--- 215,228 ----
  		 * process, which will have a slot near the end of the array.
  		 */
  		int		i;
+ 		bool	found = false;
+ 
+ 		/*
+ 		 * If the signal is not a harmless one, grab a lock to be sure we
+ 		 * send it to the correct backend.
+ 		 */
+ 		if (!IsHarmlessSignal(reason))
+ 			LWLockAcquire(ProcArrayLock, LW_SHARED);
  
  		for (i = NumProcSignalSlots - 1; i >= 0; i--)
  		{
***************
*** 207,220 ****
  
  			if (slot->pss_pid == pid)
  			{
! 				/* the above note about race conditions applies here too */
  
  				/* Atomically set the proper flag */
  				slot->pss_signalFlags[reason] = true;
  				/* Send signal */
! 				return kill(pid, SIGUSR1);
  			}
  		}
  	}
  
  	errno = ESRCH;
--- 230,250 ----
  
  			if (slot->pss_pid == pid)
  			{
! 				found = true;
  
  				/* Atomically set the proper flag */
  				slot->pss_signalFlags[reason] = true;
  				/* Send signal */
! 				ret = kill(pid, SIGUSR1);
! 				break;
  			}
  		}
+ 
+ 		if (!IsHarmlessSignal(reason))
+ 			LWLockRelease(ProcArrayLock);
+ 
+ 		if (found)
+ 			return ret;
  	}
  
  	errno = ESRCH;
***************
*** 252,262 ****
--- 282,314 ----
  {
  	int		save_errno = errno;
  
+ 	/* signal reasons that notify a backend of an event */
  	if (CheckProcSignal(PROCSIG_CATCHUP_INTERRUPT))
  		HandleCatchupInterrupt();
  
  	if (CheckProcSignal(PROCSIG_NOTIFY_INTERRUPT))
  		HandleNotifyInterrupt();
  
+ 	/* signal reasons that request the backend to cancel some action */
+ 
+ 	/* pg_cancel_idle_transaction() */
+ 	if (CheckProcSignal(PROCSIG_CANCEL_IDLE_TRANSACTION))
+ 		HandleCancelAction(false, true, false);
+ 
+ 	/* Hot standby */
+ 	if (CheckProcSignal(PROCSIG_CONFLICT_ERROR_INTERRUPT))
+ 	{
+ 		SetInterruptEventPending(ConflictErrorPending);
+ 		HandleCancelAction(true, true, false);
+ 	}
+ 
+ 	/* Hot standby */
+ 	if (CheckProcSignal(PROCSIG_CONFLICT_FATAL_INTERRUPT))
+ 	{
+ 		SetInterruptEventPending(ConflictFatalPending);
+ 		HandleCancelAction(false, false, true);
+ 	}
+ 
  	errno = save_errno;
  }
+ 
diff -cr cvs/src/backend/storage/ipc/standby.c cvs.build/src/backend/storage/ipc/standby.c
*** cvs/src/backend/storage/ipc/standby.c	2009-12-19 02:32:35.000000000 +0100
--- cvs.build/src/backend/storage/ipc/standby.c	2010-01-06 19:35:02.000000000 +0100
***************
*** 219,229 ****
  			{
  				pid_t pid;
  
  				/*
  				 * Now find out who to throw out of the balloon.
  				 */
  				Assert(VirtualTransactionIdIsValid(*waitlist));
! 				pid = CancelVirtualTransaction(*waitlist, cancel_mode);
  
  				if (pid != 0)
  				{
--- 219,236 ----
  			{
  				pid_t pid;
  
+ 				ProcSignalReason	sigmode;
+ 
+ 				if (cancel_mode == CONFLICT_MODE_ERROR)
+ 					sigmode = PROCSIG_CONFLICT_ERROR_INTERRUPT;
+ 				else
+ 					sigmode = PROCSIG_CONFLICT_FATAL_INTERRUPT;
+ 
  				/*
  				 * Now find out who to throw out of the balloon.
  				 */
  				Assert(VirtualTransactionIdIsValid(*waitlist));
! 				pid = CancelVirtualTransaction(*waitlist, sigmode);
  
  				if (pid != 0)
  				{
diff -cr cvs/src/backend/tcop/postgres.c cvs.build/src/backend/tcop/postgres.c
*** cvs/src/backend/tcop/postgres.c	2009-12-24 13:55:18.000000000 +0100
--- cvs.build/src/backend/tcop/postgres.c	2010-01-07 12:46:26.000000000 +0100
***************
*** 479,486 ****
  		/* Allow "die" interrupt to be processed while waiting */
  		ImmediateInterruptOK = true;
  
! 		/* And don't forget to detect one that already arrived */
! 		QueryCancelPending = false;
  		CHECK_FOR_INTERRUPTS();
  	}
  }
--- 479,486 ----
  		/* Allow "die" interrupt to be processed while waiting */
  		ImmediateInterruptOK = true;
  
! 		/* And don't forget to detect those that already arrived */
! 		ZeroInterruptEventPending();
  		CHECK_FOR_INTERRUPTS();
  	}
  }
***************
*** 494,500 ****
  	if (DoingCommandRead)
  	{
  		ImmediateInterruptOK = false;
! 		QueryCancelPending = false;		/* forget any CANCEL signal */
  
  		DisableNotifyInterrupt();
  		DisableCatchupInterrupt();
--- 494,501 ----
  	if (DoingCommandRead)
  	{
  		ImmediateInterruptOK = false;
! 		/* forget any CANCEL signal */
! 		ZeroInterruptEventPending();
  
  		DisableNotifyInterrupt();
  		DisableCatchupInterrupt();
***************
*** 2593,2661 ****
  void
  die(SIGNAL_ARGS)
  {
! 	int			save_errno = errno;
! 
! 	/* Don't joggle the elbow of proc_exit */
! 	if (!proc_exit_inprogress)
! 	{
! 		InterruptPending = true;
! 		ProcDiePending = true;
! 
! 		/*
! 		 * If it's safe to interrupt, and we're waiting for input or a lock,
! 		 * service the interrupt immediately
! 		 */
! 		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
! 			CritSectionCount == 0)
! 		{
! 			/* bump holdoff count to make ProcessInterrupts() a no-op */
! 			/* until we are done getting ready for it */
! 			InterruptHoldoffCount++;
! 			LockWaitCancel();	/* prevent CheckDeadLock from running */
! 			DisableNotifyInterrupt();
! 			DisableCatchupInterrupt();
! 			InterruptHoldoffCount--;
! 			ProcessInterrupts();
! 		}
! 	}
! 
! 	errno = save_errno;
  }
  
  /*
!  * Query-cancel signal from postmaster: abort current transaction
!  * at soonest convenient time
   */
  void
  StatementCancelHandler(SIGNAL_ARGS)
  {
  	int			save_errno = errno;
  
  	/*
! 	 * Don't joggle the elbow of proc_exit
  	 */
! 	if (!proc_exit_inprogress)
! 	{
! 		InterruptPending = true;
! 		QueryCancelPending = true;
  
! 		/*
! 		 * If it's safe to interrupt, and we're waiting for a lock, service
! 		 * the interrupt immediately.  No point in interrupting if we're
! 		 * waiting for input, however.
! 		 */
! 		if (InterruptHoldoffCount == 0 && CritSectionCount == 0 &&
! 			(DoingCommandRead || ImmediateInterruptOK))
! 		{
! 			/* bump holdoff count to make ProcessInterrupts() a no-op */
! 			/* until we are done getting ready for it */
! 			InterruptHoldoffCount++;
! 			LockWaitCancel();	/* prevent CheckDeadLock from running */
! 			DisableNotifyInterrupt();
! 			DisableCatchupInterrupt();
! 			InterruptHoldoffCount--;
! 			ProcessInterrupts();
! 		}
  	}
  
  	errno = save_errno;
--- 2594,2668 ----
  void
  die(SIGNAL_ARGS)
  {
! 	HandleCancelAction(false, false, true);
  }
  
  /*
!  * Query-cancel signal from postmaster: abort current query at soonest
!  * convenient time, do not abort an idle transaction however.
   */
  void
  StatementCancelHandler(SIGNAL_ARGS)
  {
+ 	HandleCancelAction(true, false, false);
+ }
+ 
+ void
+ HandleCancelAction(bool cancelQuery,
+ 				   bool cancelIdleTransaction,
+ 				   bool cancelSession)
+ {
  	int			save_errno = errno;
  
+ 	/* Don't joggle the elbow of proc_exit */
+ 	if (proc_exit_inprogress)
+ 		return;
+ 
  	/*
! 	 * Check cancel requests against internal states, we might want to
! 	 * ignore a request.
  	 */
! 	if (cancelQuery && DoingCommandRead)
! 		cancelQuery = false;
! 	if (cancelIdleTransaction)
! 	{
! 		if (!DoingCommandRead)
! 			cancelIdleTransaction = false;
! 		else if (!IsTransactionOrTransactionBlock())
! 			cancelIdleTransaction = false;
! 		else if (IsAbortedTransactionBlockState())
! 			cancelIdleTransaction = false;
! 	}
  
! 	if (!cancelQuery && !cancelIdleTransaction && !cancelSession)
! 		return;
! 
! 	/*
! 	 * Set the global variable InterruptPendingEvent to whatever is left set
! 	 * to true after the previous checks.
! 	 */
! 	if (cancelQuery)
! 		SetInterruptEventPending(CancelQueryPending);
! 	if (cancelIdleTransaction)
! 		SetInterruptEventPending(CancelIdleTransactionPending);
! 	if (cancelSession)
! 		SetInterruptEventPending(CancelSessionPending);
! 
! 	/*
! 	 * If it's safe to interrupt, and we're waiting for a lock, service
! 	 * the interrupt immediately.
! 	 */
! 	if (InterruptHoldoffCount == 0 && CritSectionCount == 0 &&
! 		ImmediateInterruptOK)
! 	{
! 		/* bump holdoff count to make ProcessInterrupts() a no-op */
! 		/* until we are done getting ready for it */
! 		InterruptHoldoffCount++;
! 		LockWaitCancel();	/* prevent CheckDeadLock from running */
! 		DisableNotifyInterrupt();
! 		DisableCatchupInterrupt();
! 		InterruptHoldoffCount--;
! 		ProcessInterrupts();
  	}
  
  	errno = save_errno;
***************
*** 2685,2704 ****
   * ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
   *
   * If an interrupt condition is pending, and it's safe to service it,
!  * then clear the flag and accept the interrupt.  Called only when
!  * InterruptPending is true.
   */
  void
  ProcessInterrupts(void)
  {
  	/* OK to accept interrupt now? */
  	if (InterruptHoldoffCount != 0 || CritSectionCount != 0)
  		return;
! 	InterruptPending = false;
! 	if (ProcDiePending)
  	{
- 		ProcDiePending = false;
- 		QueryCancelPending = false;		/* ProcDie trumps QueryCancel */
  		ImmediateInterruptOK = false;	/* not idle anymore */
  		DisableNotifyInterrupt();
  		DisableCatchupInterrupt();
--- 2692,2712 ----
   * ProcessInterrupts: out-of-line portion of CHECK_FOR_INTERRUPTS() macro
   *
   * If an interrupt condition is pending, and it's safe to service it,
!  * then clear the flag and accept the interrupt. Called only when
!  * InterruptPendingEvent != NothingPending.
   */
  void
  ProcessInterrupts(void)
  {
+ 	const char *msg;
+ 	int			code;
+ 
  	/* OK to accept interrupt now? */
  	if (InterruptHoldoffCount != 0 || CritSectionCount != 0)
  		return;
! 
! 	if (InterruptPendingEvent & CancelSessionPending)
  	{
  		ImmediateInterruptOK = false;	/* not idle anymore */
  		DisableNotifyInterrupt();
  		DisableCatchupInterrupt();
***************
*** 2706,2722 ****
  		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
  			whereToSendOutput = DestNone;
  		if (IsAutoVacuumWorkerProcess())
! 			ereport(FATAL,
! 					(errcode(ERRCODE_ADMIN_SHUTDOWN),
! 					 errmsg("terminating autovacuum process due to administrator command")));
  		else
! 			ereport(FATAL,
! 					(errcode(ERRCODE_ADMIN_SHUTDOWN),
! 			 errmsg("terminating connection due to administrator command")));
  	}
! 	if (QueryCancelPending)
  	{
- 		QueryCancelPending = false;
  		ImmediateInterruptOK = false;	/* not idle anymore */
  		DisableNotifyInterrupt();
  		DisableCatchupInterrupt();
--- 2714,2742 ----
  		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
  			whereToSendOutput = DestNone;
  		if (IsAutoVacuumWorkerProcess())
! 		{
! 			code = ERRCODE_ADMIN_SHUTDOWN;
! 			msg = "terminating autovacuum process due to administrator command";
! 		}
! 		else if (InterruptPendingEvent & ConflictFatalPending)
! 		{
! 			Assert(RecoveryInProgress());
! 			code = ERRCODE_QUERY_CANCELED;
! 			msg = "canceling session due to conflict with recovery";
! 		}
  		else
! 		{
! 			code = ERRCODE_ADMIN_SHUTDOWN;
! 			msg = "terminating connection due to administrator command";
! 		}
! 
! 		ZeroInterruptEventPending();
! 
! 		ereport(FATAL, (errcode(code),
! 						errmsg(msg)));
  	}
! 	if (InterruptPendingEvent & CancelQueryPending)
  	{
  		ImmediateInterruptOK = false;	/* not idle anymore */
  		DisableNotifyInterrupt();
  		DisableCatchupInterrupt();
***************
*** 2724,2795 ****
  		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
  			whereToSendOutput = DestNone;
  		if (ClientAuthInProgress)
! 			ereport(ERROR,
! 					(errcode(ERRCODE_QUERY_CANCELED),
! 					 errmsg("canceling authentication due to timeout")));
  		else if (cancel_from_timeout)
! 			ereport(ERROR,
! 					(errcode(ERRCODE_QUERY_CANCELED),
! 					 errmsg("canceling statement due to statement timeout")));
  		else if (IsAutoVacuumWorkerProcess())
! 			ereport(ERROR,
! 					(errcode(ERRCODE_QUERY_CANCELED),
! 					 errmsg("canceling autovacuum task")));
! 		else
  		{
! 			int cancelMode = MyProc->recoveryConflictMode;
  
! 			/*
! 			 * XXXHS: We don't yet have a clean way to cancel an
! 			 * idle-in-transaction session, so make it FATAL instead.
! 			 * This isn't as bad as it looks because we don't issue a
! 			 * CONFLICT_MODE_ERROR for a session with proc->xmin == 0
! 			 * on cleanup conflicts. There's a possibility that we
! 			 * marked somebody as a conflict and then they go idle.
! 			 */
! 			if (DoingCommandRead && IsTransactionBlock() &&
! 				cancelMode == CONFLICT_MODE_ERROR)
! 			{
! 				cancelMode = CONFLICT_MODE_FATAL;
! 			}
  
! 			switch (cancelMode)
! 			{
! 				case CONFLICT_MODE_FATAL:
! 						Assert(RecoveryInProgress());
! 						ereport(FATAL,
! 							(errcode(ERRCODE_QUERY_CANCELED),
! 							 errmsg("canceling session due to conflict with recovery")));
! 
! 				case CONFLICT_MODE_ERROR:
! 						/*
! 						 * We are aborting because we need to release
! 						 * locks. So we need to abort out of all
! 						 * subtransactions to make sure we release
! 						 * all locks at whatever their level.
! 						 *
! 						 * XXX Should we try to examine the
! 						 * transaction tree and cancel just enough
! 						 * subxacts to remove locks? Doubt it.
! 						 */
! 						Assert(RecoveryInProgress());
! 						AbortOutOfAnyTransaction();
! 						ereport(ERROR,
! 							(errcode(ERRCODE_QUERY_CANCELED),
! 							 errmsg("canceling statement due to conflict with recovery")));
! 						return;
! 
! 				default:
! 						/* No conflict pending, so fall through */
! 						break;
! 			}
  
! 			ereport(ERROR,
! 					(errcode(ERRCODE_QUERY_CANCELED),
! 					 errmsg("canceling statement due to user request")));
  		}
  	}
! 	/* If we get here, do nothing (probably, QueryCancelPending was reset) */
  }
  
  
--- 2744,2794 ----
  		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
  			whereToSendOutput = DestNone;
  		if (ClientAuthInProgress)
! 			msg = "canceling authentication due to timeout";
  		else if (cancel_from_timeout)
! 			msg = "canceling statement due to statement timeout";
  		else if (IsAutoVacuumWorkerProcess())
! 			msg = "canceling autovacuum task";
! 		else if (InterruptPendingEvent & ConflictErrorPending)
  		{
! 			Assert(RecoveryInProgress());
! 			msg = "canceling statement due to conflict with recovery";
! 		}
! 		else
! 			msg = "canceling statement due to user request";
  
! 		ZeroInterruptEventPending();
  
! 		ereport(ERROR,
! 				(errcode(ERRCODE_QUERY_CANCELED),
! 				 errmsg(msg)));
! 	}
! 	if (InterruptPendingEvent & CancelIdleTransactionPending)
! 	{
! 		ImmediateInterruptOK = false;	/* not idle anymore */
! 		DisableNotifyInterrupt();
! 		DisableCatchupInterrupt();
  
! 		if (InterruptPendingEvent & ConflictErrorPending)
! 		{
! 			Assert(RecoveryInProgress());
! 			msg = "canceling statement due to conflict with recovery";
  		}
+ 		else
+ 			msg = "canceling idle transaction due to administrator request";
+ 
+ 		ZeroInterruptEventPending();
+ 
+ 		ereport(NOTICE,
+ 				(errcode(ERRCODE_QUERY_CANCELED),
+ 				 errmsg(msg)));
+ 
+ 		AbortTransactionAndAnySubtransactions();
+ 
+ 		set_ps_display("idle in transaction (aborted)", false);
+ 		pgstat_report_activity("<IDLE> in transaction (aborted)");
  	}
! 	/* If we get here, do nothing (probably, CancelQueryPending was reset) */
  }
  
  
***************
*** 3507,3518 ****
  		HOLD_INTERRUPTS();
  
  		/*
! 		 * Forget any pending QueryCancel request, since we're returning to
  		 * the idle loop anyway, and cancel the statement timer if running.
  		 */
! 		QueryCancelPending = false;
  		disable_sig_alarm(true);
! 		QueryCancelPending = false;		/* again in case timeout occurred */
  
  		/*
  		 * Turn off these interrupts too.  This is only needed here and not in
--- 3506,3518 ----
  		HOLD_INTERRUPTS();
  
  		/*
! 		 * Forget any pending CancelQuery request, since we're returning to
  		 * the idle loop anyway, and cancel the statement timer if running.
  		 */
! 		ZeroInterruptEventPending();
  		disable_sig_alarm(true);
! 		/* again in case timeout occurred */
! 		ZeroInterruptEventPending();
  
  		/*
  		 * Turn off these interrupts too.  This is only needed here and not in
***************
*** 3603,3609 ****
  		 */
  		if (send_ready_for_query)
  		{
! 			if (IsTransactionOrTransactionBlock())
  			{
  				set_ps_display("idle in transaction", false);
  				pgstat_report_activity("<IDLE> in transaction");
--- 3603,3614 ----
  		 */
  		if (send_ready_for_query)
  		{
! 			if (IsAbortedTransactionBlockState())
! 			{
! 				set_ps_display("idle in transaction (aborted)", false);
! 				pgstat_report_activity("<IDLE> in transaction (aborted)");
! 			}
! 			else if (IsTransactionOrTransactionBlock())
  			{
  				set_ps_display("idle in transaction", false);
  				pgstat_report_activity("<IDLE> in transaction");
***************
*** 3626,3632 ****
  		 * conditional since we don't want, say, reads on behalf of COPY FROM
  		 * STDIN doing the same thing.)
  		 */
! 		QueryCancelPending = false;		/* forget any earlier CANCEL signal */
  		DoingCommandRead = true;
  
  		/*
--- 3631,3638 ----
  		 * conditional since we don't want, say, reads on behalf of COPY FROM
  		 * STDIN doing the same thing.)
  		 */
! 		/* forget any earlier cancel signals */
! 		ZeroInterruptEventPending();
  		DoingCommandRead = true;
  
  		/*
diff -cr cvs/src/backend/utils/adt/misc.c cvs.build/src/backend/utils/adt/misc.c
*** cvs/src/backend/utils/adt/misc.c	2009-12-09 11:24:45.000000000 +0100
--- cvs.build/src/backend/utils/adt/misc.c	2010-01-07 12:38:52.000000000 +0100
***************
*** 30,35 ****
--- 30,36 ----
  #include "storage/fd.h"
  #include "storage/pmsignal.h"
  #include "storage/procarray.h"
+ #include "storage/procsignal.h"
  #include "utils/builtins.h"
  #include "tcop/tcopprot.h"
  
***************
*** 117,122 ****
--- 118,132 ----
  }
  
  Datum
+ pg_cancel_idle_transaction(PG_FUNCTION_ARGS)
+ {
+ 	int     ret;
+ 	pid_t   pid = PG_GETARG_INT32(0);
+ 	ret = SendProcSignal(pid, PROCSIG_CANCEL_IDLE_TRANSACTION, InvalidBackendId);
+ 	PG_RETURN_BOOL(ret == 0);
+ }
+ 
+ Datum
  pg_reload_conf(PG_FUNCTION_ARGS)
  {
  	if (!superuser())
diff -cr cvs/src/backend/utils/init/globals.c cvs.build/src/backend/utils/init/globals.c
*** cvs/src/backend/utils/init/globals.c	2009-12-09 11:24:42.000000000 +0100
--- cvs.build/src/backend/utils/init/globals.c	2010-01-03 20:58:09.000000000 +0100
***************
*** 25,33 ****
  
  ProtocolVersion FrontendProtocol;
  
! volatile bool InterruptPending = false;
! volatile bool QueryCancelPending = false;
! volatile bool ProcDiePending = false;
  volatile bool ImmediateInterruptOK = false;
  volatile uint32 InterruptHoldoffCount = 0;
  volatile uint32 CritSectionCount = 0;
--- 25,31 ----
  
  ProtocolVersion FrontendProtocol;
  
! volatile int InterruptPendingEvent = NothingPending;
  volatile bool ImmediateInterruptOK = false;
  volatile uint32 InterruptHoldoffCount = 0;
  volatile uint32 CritSectionCount = 0;
diff -cr cvs/src/include/access/xact.h cvs.build/src/include/access/xact.h
*** cvs/src/include/access/xact.h	2009-12-24 13:55:28.000000000 +0100
--- cvs.build/src/include/access/xact.h	2010-01-04 08:01:52.000000000 +0100
***************
*** 189,194 ****
--- 189,195 ----
  extern void StartTransactionCommand(void);
  extern void CommitTransactionCommand(void);
  extern void AbortCurrentTransaction(void);
+ extern void AbortTransactionAndAnySubtransactions(void);
  extern void BeginTransactionBlock(void);
  extern bool EndTransactionBlock(void);
  extern bool PrepareTransactionBlock(char *gid);
diff -cr cvs/src/include/catalog/pg_proc.h cvs.build/src/include/catalog/pg_proc.h
*** cvs/src/include/catalog/pg_proc.h	2009-12-24 13:55:29.000000000 +0100
--- cvs.build/src/include/catalog/pg_proc.h	2010-01-04 07:45:01.000000000 +0100
***************
*** 3268,3273 ****
--- 3268,3275 ----
  
  DATA(insert OID = 2171 ( pg_cancel_backend		PGNSP PGUID 12 1 0 0 f f f t f v 1 0 16 "23" _null_ _null_ _null_ _null_ pg_cancel_backend _null_ _null_ _null_ ));
  DESCR("cancel a server process' current query");
+ DATA(insert OID = 2179 ( pg_cancel_idle_transaction		PGNSP PGUID 12 1 0 0 f f f t f v 1 0 16 "23" _null_ _null_ _null_ _null_ pg_cancel_idle_transaction _null_ _null_ _null_ ));
+ DESCR("cancel a server process' idle transaction");
  DATA(insert OID = 2096 ( pg_terminate_backend		PGNSP PGUID 12 1 0 0 f f f t f v 1 0 16 "23" _null_ _null_ _null_ _null_ pg_terminate_backend _null_ _null_ _null_ ));
  DESCR("terminate a server process");
  DATA(insert OID = 2172 ( pg_start_backup		PGNSP PGUID 12 1 0 0 f f f t f v 2 0 25 "25 16" _null_ _null_ _null_ _null_ pg_start_backup _null_ _null_ _null_ ));
diff -cr cvs/src/include/miscadmin.h cvs.build/src/include/miscadmin.h
*** cvs/src/include/miscadmin.h	2009-12-24 13:55:28.000000000 +0100
--- cvs.build/src/include/miscadmin.h	2010-01-06 21:13:35.000000000 +0100
***************
*** 64,74 ****
   *
   *****************************************************************************/
  
  /* in globals.c */
! /* these are marked volatile because they are set by signal handlers: */
! extern PGDLLIMPORT volatile bool InterruptPending;
! extern volatile bool QueryCancelPending;
! extern volatile bool ProcDiePending;
  
  /* these are marked volatile because they are examined by signal handlers: */
  extern volatile bool ImmediateInterruptOK;
--- 64,85 ----
   *
   *****************************************************************************/
  
+ #define NothingPending						0
+ #define CancelQueryPending					1
+ #define CancelIdleTransactionPending		2
+ #define CancelQueryOrTransactionPending		4
+ #define CancelSessionPending				8
+ #define ConflictErrorPending				16
+ #define ConflictFatalPending				32
+ 
+ #define IsInterruptEventPending		   (InterruptPendingEvent != NothingPending)
+ #define SetInterruptEventPending(ev)   (InterruptPendingEvent |= ev)
+ #define ClearInterruptEventPending(ev) (InterruptPendingEvent &= ~ev)
+ #define ZeroInterruptEventPending()	   (InterruptPendingEvent = NothingPending)
+ 
  /* in globals.c */
! /* this is marked volatile because it is set by signal handlers: */
! extern PGDLLIMPORT volatile int InterruptPendingEvent;
  
  /* these are marked volatile because they are examined by signal handlers: */
  extern volatile bool ImmediateInterruptOK;
***************
*** 82,88 ****
  
  #define CHECK_FOR_INTERRUPTS() \
  do { \
! 	if (InterruptPending) \
  		ProcessInterrupts(); \
  } while(0)
  #else							/* WIN32 */
--- 93,99 ----
  
  #define CHECK_FOR_INTERRUPTS() \
  do { \
! 	if (IsInterruptEventPending) \
  		ProcessInterrupts(); \
  } while(0)
  #else							/* WIN32 */
***************
*** 91,97 ****
  do { \
  	if (UNBLOCKED_SIGNAL_QUEUE()) \
  		pgwin32_dispatch_queued_signals(); \
! 	if (InterruptPending) \
  		ProcessInterrupts(); \
  } while(0)
  #endif   /* WIN32 */
--- 102,108 ----
  do { \
  	if (UNBLOCKED_SIGNAL_QUEUE()) \
  		pgwin32_dispatch_queued_signals(); \
! 	if (IsInterruptEventPending) \
  		ProcessInterrupts(); \
  } while(0)
  #endif   /* WIN32 */
diff -cr cvs/src/include/storage/procarray.h cvs.build/src/include/storage/procarray.h
*** cvs/src/include/storage/procarray.h	2009-12-24 13:55:33.000000000 +0100
--- cvs.build/src/include/storage/procarray.h	2010-01-06 19:36:15.000000000 +0100
***************
*** 15,20 ****
--- 15,21 ----
  #define PROCARRAY_H
  
  #include "storage/lock.h"
+ #include "storage/procsignal.h"
  #include "storage/standby.h"
  #include "utils/snapshot.h"
  
***************
*** 59,65 ****
  extern VirtualTransactionId *GetConflictingVirtualXIDs(TransactionId limitXmin,
  					Oid dbOid, bool skipExistingConflicts);
  extern pid_t CancelVirtualTransaction(VirtualTransactionId vxid,
! 						 int cancel_mode);
  
  extern int	CountActiveBackends(void);
  extern int	CountDBBackends(Oid databaseid);
--- 60,66 ----
  extern VirtualTransactionId *GetConflictingVirtualXIDs(TransactionId limitXmin,
  					Oid dbOid, bool skipExistingConflicts);
  extern pid_t CancelVirtualTransaction(VirtualTransactionId vxid,
! 									  ProcSignalReason sigmode);
  
  extern int	CountActiveBackends(void);
  extern int	CountDBBackends(Oid databaseid);
diff -cr cvs/src/include/storage/procsignal.h cvs.build/src/include/storage/procsignal.h
*** cvs/src/include/storage/procsignal.h	2009-12-09 11:24:53.000000000 +0100
--- cvs.build/src/include/storage/procsignal.h	2010-01-07 00:34:26.000000000 +0100
***************
*** 32,40 ****
--- 32,51 ----
  	PROCSIG_CATCHUP_INTERRUPT,	/* sinval catchup interrupt */
  	PROCSIG_NOTIFY_INTERRUPT,	/* listen/notify interrupt */
  
+ 	IDX_PROCSIG_HARMLESS,		/* not a real identifier but an index.
+ 								 * we treat harmless signals different than
+ 								 * harmful ones. */
+ 
+ 	PROCSIG_CANCEL_IDLE_TRANSACTION, /* pg_backend_cancel_idle_transaction() */
+ 
+ 	PROCSIG_CONFLICT_ERROR_INTERRUPT, /* used by HotStandby for conflict */
+ 	PROCSIG_CONFLICT_FATAL_INTERRUPT, /* resolution.					 */
+ 
  	NUM_PROCSIGNALS				/* Must be last! */
  } ProcSignalReason;
  
+ #define IsHarmlessSignal(reason) (reason < IDX_PROCSIG_HARMLESS)
+ 
  /*
   * prototypes for functions in procsignal.c
   */
diff -cr cvs/src/include/tcop/tcopprot.h cvs.build/src/include/tcop/tcopprot.h
*** cvs/src/include/tcop/tcopprot.h	2009-12-09 11:24:53.000000000 +0100
--- cvs.build/src/include/tcop/tcopprot.h	2010-01-07 00:22:38.000000000 +0100
***************
*** 64,69 ****
--- 64,74 ----
  extern void quickdie(SIGNAL_ARGS);
  extern void StatementCancelHandler(SIGNAL_ARGS);
  extern void FloatExceptionHandler(SIGNAL_ARGS);
+ 
+ extern void HandleCancelAction(bool cancelQuery,
+ 							   bool cancelIdleTransaction,
+ 							   bool cancelSession);
+ 
  extern void prepare_for_client_read(void);
  extern void client_read_ended(void);
  extern const char *process_postgres_switches(int argc, char *argv[],
diff -cr cvs/src/include/utils/builtins.h cvs.build/src/include/utils/builtins.h
*** cvs/src/include/utils/builtins.h	2009-12-24 13:55:33.000000000 +0100
--- cvs.build/src/include/utils/builtins.h	2010-01-06 21:52:12.000000000 +0100
***************
*** 443,448 ****
--- 443,449 ----
  extern Datum current_database(PG_FUNCTION_ARGS);
  extern Datum current_query(PG_FUNCTION_ARGS);
  extern Datum pg_cancel_backend(PG_FUNCTION_ARGS);
+ extern Datum pg_cancel_idle_transaction(PG_FUNCTION_ARGS);
  extern Datum pg_terminate_backend(PG_FUNCTION_ARGS);
  extern Datum pg_reload_conf(PG_FUNCTION_ARGS);
  extern Datum pg_tablespace_databases(PG_FUNCTION_ARGS);
#20Simon Riggs
simon@2ndQuadrant.com
In reply to: Joachim Wieland (#19)
Re: Hot Standy introduced problem with query cancel behavior

On Thu, 2010-01-07 at 14:45 +0100, Joachim Wieland wrote:

On Thu, Dec 31, 2009 at 6:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Building racy infrastructure when it can be avoided with a little care
still seems not to be the best path to me.

Doing that will add more complexity in an area that is hard to test
effectively. I think the risk of introducing further bugs while trying
to fix this rare condition is high. Right now the conflict processing
needs more work and is often much less precise than this, so improving
this aspect of it would not be a priority. I've added it to the TODO
though. Thank you for your research.

Patch implements recovery conflict signalling using SIGUSR1
multiplexing, then uses a SessionCancelPending mode similar to Joachim's
TransactionCancelPending.

I have reworked Simon's patch a bit and attach the result.

Oh dear, this is exactly what I've been working on...

Quick facts:

- Hot Standby only uses SIGUSR1
- SIGINT behaves as it did before: it only cancels running statements
- pg_cancel_backend() continues to use SIGINT
- I added pg_cancel_idle_transaction() to cancel an idle transaction via
SIGUSR1
- One central function HandleCancelAction() sets the flags before calling
ProcessInterrupts(), it is called from the different signal handlers and
receives parameters about what it should do
- If a SIGUSR1 reason is used that will cancel something, ProcArrayLock is
acquired until the signal has been sent to make sure that we won't signal the
wrong backend. Does this sufficiently cover the concerns of Andres Freund
discussed upthread?

As there were so many boolean SomethingCancelPending variables I changed them
to be bitmasks and merged all of them into a single variable. However I am not
sure if we can change the name and type of especially InterruptPending that
easily as it was marked with PGDLLIMPORT...

@Simon: Is there a reason why you have not yet removed recoveryConflictMode
from PGPROC?

--
Simon Riggs www.2ndQuadrant.com

#21Andres Freund
andres@anarazel.de
In reply to: Joachim Wieland (#19)
Re: Hot Standy introduced problem with query cancel behavior

On Thursday 07 January 2010 14:45:55 Joachim Wieland wrote:

On Thu, Dec 31, 2009 at 6:40 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Building racy infrastructure when it can be avoided with a little care
still seems not to be the best path to me.

Doing that will add more complexity in an area that is hard to test
effectively. I think the risk of introducing further bugs while trying
to fix this rare condition is high. Right now the conflict processing
needs more work and is often much less precise than this, so improving
this aspect of it would not be a priority. I've added it to the TODO
though. Thank you for your research.

Patch implements recovery conflict signalling using SIGUSR1
multiplexing, then uses a SessionCancelPending mode similar to Joachim's
TransactionCancelPending.

I have reworked Simon's patch a bit and attach the result.

Quick facts:

- Hot Standby only uses SIGUSR1
- SIGINT behaves as it did before: it only cancels running statements
- pg_cancel_backend() continues to use SIGINT
- I added pg_cancel_idle_transaction() to cancel an idle transaction via
SIGUSR1
- One central function HandleCancelAction() sets the flags before calling
ProcessInterrupts(), it is called from the different signal handlers and
receives parameters about what it should do
- If a SIGUSR1 reason is used that will cancel something, ProcArrayLock is
acquired until the signal has been sent to make sure that we won't signal
the wrong backend. Does this sufficiently cover the concerns of Andres
Freund discussed upthread?

I think it solves the major concern (which I btw could easily reproduce using
software that is in production) but unfortunately not completely.
The avoided situation is:

C(Client): BEGIN; SELECT WHATEVER;
S(Standby): conflict with C
S: Starting to cancel C
C: COMMIT
S: Sending Signal to C
C: Wrong transaction is aborted

The situation not avoided is:
C: BEGIN; SELECT ...
S: conflict with C, lock procarray, sending signal(thats asynchronous), unlock
procarray
C: COMMIT; BEGIN
C: Signal arrives
C: Wrong txn is killled

It should be easy to fix this by having a cancel_localTransactionId field in the
procarray which gets cleaned uppon transaction/backend start and gets checked
in the signal handler (should be casted to sig_atomic_t)

Will cookup a patch if nobody speaks against something like that.

Andres

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joachim Wieland (#19)
Re: Hot Standy introduced problem with query cancel behavior

Joachim Wieland <joe@mcknight.de> writes:

As there were so many boolean SomethingCancelPending variables I changed them
to be bitmasks and merged all of them into a single variable.

This seems like a truly horrid idea, because those variables are set by
signal handlers. A bitmask cannot be manipulated atomically, so you
have almost certainly introduced race-condition bugs.

regards, tom lane

#23Joachim Wieland
joe@mcknight.de
In reply to: Tom Lane (#22)
Re: Hot Standy introduced problem with query cancel behavior

On Thu, Jan 7, 2010 at 4:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Joachim Wieland <joe@mcknight.de> writes:

As there were so many boolean SomethingCancelPending variables I changed them
to be bitmasks and merged all of them into a single variable.

This seems like a truly horrid idea, because those variables are set by
signal handlers.  A bitmask cannot be manipulated atomically, so you
have almost certainly introduced race-condition bugs.

True... However there are just a few places where the patch uses
bitmasks for modification of the variable. As Simon seems to be
working on this currently anyway, I'll leave it to him to either keep
the 5 boolean variables or do some mutual exclusion as in
HandleNotifyInterrupt() (or do something completely different).

Joachim

#24Joachim Wieland
joe@mcknight.de
In reply to: Simon Riggs (#20)
Re: Hot Standy introduced problem with query cancel behavior

On Thu, Jan 7, 2010 at 3:03 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Thu, 2010-01-07 at 14:45 +0100, Joachim Wieland wrote:

I have reworked Simon's patch a bit and attach the result.

Oh dear, this is exactly what I've been working on...

Sorry, as you have posted a first patch some days ago I thought you
were waiting for feedback... Just tried to help ;-)

Joachim

#25Simon Riggs
simon@2ndQuadrant.com
In reply to: Joachim Wieland (#24)
Re: Hot Standy introduced problem with query cancel behavior

On Thu, 2010-01-07 at 17:50 +0100, Joachim Wieland wrote:

On Thu, Jan 7, 2010 at 3:03 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Thu, 2010-01-07 at 14:45 +0100, Joachim Wieland wrote:

I have reworked Simon's patch a bit and attach the result.

Oh dear, this is exactly what I've been working on...

Sorry, as you have posted a first patch some days ago I thought you
were waiting for feedback... Just tried to help ;-)

Well, you have been very helpful, its just this last little part that is
duplicated.

--
Simon Riggs www.2ndQuadrant.com

#26Simon Riggs
simon@2ndQuadrant.com
In reply to: Joachim Wieland (#19)
Re: Hot Standy introduced problem with query cancel behavior

On Thu, 2010-01-07 at 14:45 +0100, Joachim Wieland wrote:

@Simon: Is there a reason why you have not yet removed recoveryConflictMode
from PGPROC?

Unfortunately we still need a mechanism to mark which backends have been
cancelled already. Transaction state for virtual transactions isn't
visible on the procarray, so we need something there to indicate that a
backend has been sent a conflict. Otherwise we'd end up waiting for it
endlessly. The name will be changing though.

--
Simon Riggs www.2ndQuadrant.com

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#26)
Re: Hot Standy introduced problem with query cancel behavior

Simon Riggs <simon@2ndQuadrant.com> writes:

On Thu, 2010-01-07 at 14:45 +0100, Joachim Wieland wrote:

@Simon: Is there a reason why you have not yet removed recoveryConflictMode
from PGPROC?

Unfortunately we still need a mechanism to mark which backends have been
cancelled already. Transaction state for virtual transactions isn't
visible on the procarray, so we need something there to indicate that a
backend has been sent a conflict. Otherwise we'd end up waiting for it
endlessly. The name will be changing though.

While we're discussing this: the current coding with
AbortOutOfAnyTransaction within ProcessInterrupts is *utterly* unsafe.
I realize that's just a toy placeholder, but getting rid of it has to be
on the list of stop-ship items. Right at the moment I'd prefer to see
CONFLICT_MODE_ERROR always turned into CONFLICT_MODE_FATAL than to
imagine this is going to work.

regards, tom lane

#28Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#27)
1 attachment(s)
Re: Hot Standy introduced problem with query cancel behavior

On Thu, 2010-01-07 at 12:14 -0500, Tom Lane wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On Thu, 2010-01-07 at 14:45 +0100, Joachim Wieland wrote:

@Simon: Is there a reason why you have not yet removed recoveryConflictMode
from PGPROC?

Unfortunately we still need a mechanism to mark which backends have been
cancelled already. Transaction state for virtual transactions isn't
visible on the procarray, so we need something there to indicate that a
backend has been sent a conflict. Otherwise we'd end up waiting for it
endlessly. The name will be changing though.

While we're discussing this: the current coding with
AbortOutOfAnyTransaction within ProcessInterrupts is *utterly* unsafe.
I realize that's just a toy placeholder, but getting rid of it has to be
on the list of stop-ship items. Right at the moment I'd prefer to see
CONFLICT_MODE_ERROR always turned into CONFLICT_MODE_FATAL than to
imagine this is going to work.

Hmmm. Can you check my current attempt? This may suffer this problem.

If, so can you explain a little more for me? Thanks.

--
Simon Riggs www.2ndQuadrant.com

Attachments:

another_recovery_cancel.patchtext/x-patch; charset=UTF-8; name=another_recovery_cancel.patchDownload
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index ea40420..e05792e 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -313,8 +313,7 @@ IsTransactionState(void)
 /*
  *	IsAbortedTransactionBlockState
  *
- *	This returns true if we are currently running a query
- *	within an aborted transaction block.
+ *	This returns true if we are within an aborted transaction block.
  */
 bool
 IsAbortedTransactionBlockState(void)
@@ -2692,6 +2691,48 @@ AbortCurrentTransaction(void)
 }
 
 /*
+ *	AbortTransactionAndAnySubtransactions
+ *
+ * Similar to AbortCurrentTransaction but if any subtransactions
+ * in progress we abort them *and* all of their parents. So this is
+ * used when the caller wishes to make the abort untrappable by the user.
+ * After this has run IsAbortedTransactionBlockState() will be true.
+ */
+void
+AbortTransactionAndAnySubtransactions(void)
+{
+	TransactionState s = CurrentTransactionState;
+
+	switch (s->blockState)
+	{
+		case TBLOCK_DEFAULT:
+		case TBLOCK_STARTED:
+		case TBLOCK_BEGIN:
+		case TBLOCK_INPROGRESS:
+		case TBLOCK_END:
+		case TBLOCK_ABORT:
+		case TBLOCK_SUBABORT:
+		case TBLOCK_ABORT_END:
+		case TBLOCK_ABORT_PENDING:
+		case TBLOCK_PREPARE:
+		case TBLOCK_SUBABORT_END:
+		case TBLOCK_SUBABORT_RESTART:
+			AbortCurrentTransaction();
+			break;
+
+		case TBLOCK_SUBINPROGRESS:
+		case TBLOCK_SUBBEGIN:
+		case TBLOCK_SUBEND:
+		case TBLOCK_SUBABORT_PENDING:
+		case TBLOCK_SUBRESTART:
+			AbortSubTransaction();
+			CleanupSubTransaction();
+			AbortTransactionAndAnySubtransactions();
+			break;
+	}
+}
+
+/*
  *	PreventTransactionChain
  *
  *	This routine is to be called by statements that must not run inside
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 85f14f6..e2e64dd 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -324,6 +324,7 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
 		/* must be cleared with xid/xmin: */
 		proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
 		proc->inCommit = false; /* be sure this is cleared in abort */
+		proc->recoveryConflictPending = false;
 
 		/* Clear the subtransaction-XID cache too while holding the lock */
 		proc->subxids.nxids = 0;
@@ -350,6 +351,7 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
 		/* must be cleared with xid/xmin: */
 		proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
 		proc->inCommit = false; /* be sure this is cleared in abort */
+		proc->recoveryConflictPending = false;
 
 		Assert(proc->subxids.nxids == 0);
 		Assert(proc->subxids.overflowed == false);
@@ -377,7 +379,7 @@ ProcArrayClearTransaction(PGPROC *proc)
 	proc->xid = InvalidTransactionId;
 	proc->lxid = InvalidLocalTransactionId;
 	proc->xmin = InvalidTransactionId;
-	proc->recoveryConflictMode = 0;
+	proc->recoveryConflictPending = false;
 
 	/* redundant, but just in case */
 	proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
@@ -1665,7 +1667,7 @@ GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid,
 		if (proc->pid == 0)
 			continue;
 
-		if (skipExistingConflicts && proc->recoveryConflictMode > 0)
+		if (skipExistingConflicts && proc->recoveryConflictPending)
 			continue;
 
 		if (!OidIsValid(dbOid) ||
@@ -1704,7 +1706,7 @@ GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid,
  * Returns pid of the process signaled, or 0 if not found.
  */
 pid_t
-CancelVirtualTransaction(VirtualTransactionId vxid, int cancel_mode)
+CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode)
 {
 	ProcArrayStruct *arrayP = procArray;
 	int			index;
@@ -1722,28 +1724,22 @@ CancelVirtualTransaction(VirtualTransactionId vxid, int cancel_mode)
 		if (procvxid.backendId == vxid.backendId &&
 			procvxid.localTransactionId == vxid.localTransactionId)
 		{
-			/*
-			 * Issue orders for the proc to read next time it receives SIGINT
-			 */
-			if (proc->recoveryConflictMode < cancel_mode)
-				proc->recoveryConflictMode = cancel_mode;
-
+			proc->recoveryConflictPending = true;
 			pid = proc->pid;
+			if (pid != 0)
+			{
+				/*
+				 * Kill the pid if it's still here. If not, that's what we wanted
+				 * so ignore any errors.
+				 */
+				(void) SendProcSignal(pid, sigmode, vxid.backendId);
+			}
 			break;
 		}
 	}
 
 	LWLockRelease(ProcArrayLock);
 
-	if (pid != 0)
-	{
-		/*
-		 * Kill the pid if it's still here. If not, that's what we wanted
-		 * so ignore any errors.
-		 */
-		kill(pid, SIGINT);
-	}
-
 	return pid;
 }
 
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 480af00..c4007fe 100644
--- a/src/backend/storage/ipc/procsignal.c
+++ b/src/backend/storage/ipc/procsignal.c
@@ -24,6 +24,8 @@
 #include "storage/procsignal.h"
 #include "storage/shmem.h"
 #include "storage/sinval.h"
+#include "storage/standby.h"
+#include "tcop/tcopprot.h"
 
 
 /*
@@ -258,5 +260,11 @@ procsignal_sigusr1_handler(SIGNAL_ARGS)
 	if (CheckProcSignal(PROCSIG_NOTIFY_INTERRUPT))
 		HandleNotifyInterrupt();
 
+	if (CheckProcSignal(PROCSIG_CONFLICT_ERROR_INTERRUPT))
+		RecoveryConflictInterrupt(CONFLICT_MODE_ERROR);
+
+	if (CheckProcSignal(PROCSIG_CONFLICT_FATAL_INTERRUPT))
+		RecoveryConflictInterrupt(CONFLICT_MODE_FATAL);
+
 	errno = save_errno;
 }
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 0b4e022..0ea7a32 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -159,8 +159,6 @@ WaitExceedsMaxStandbyDelay(void)
  * recovery processing. Judgement has already been passed on it within
  * a specific rmgr. Here we just issue the orders to the procs. The procs
  * then throw the required error as instructed.
- *
- * We may ask for a specific cancel_mode, typically ERROR or FATAL.
  */
 void
 ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
@@ -218,12 +216,16 @@ ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 			if (WaitExceedsMaxStandbyDelay())
 			{
 				pid_t pid;
+				ProcSignalReason	sigmode = PROCSIG_CONFLICT_ERROR_INTERRUPT;
+
+				if (cancel_mode == CONFLICT_MODE_FATAL)
+					sigmode = PROCSIG_CONFLICT_FATAL_INTERRUPT;
 
 				/*
 				 * Now find out who to throw out of the balloon.
 				 */
 				Assert(VirtualTransactionIdIsValid(*waitlist));
-				pid = CancelVirtualTransaction(*waitlist, cancel_mode);
+				pid = CancelVirtualTransaction(*waitlist, sigmode);
 
 				if (pid != 0)
 				{
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 352ff08..a708369 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -318,7 +318,7 @@ InitProcess(void)
 	MyProc->waitProcLock = NULL;
 	for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
 		SHMQueueInit(&(MyProc->myProcLocks[i]));
-	MyProc->recoveryConflictMode = 0;
+	MyProc->recoveryConflictPending = false;
 
 	/*
 	 * We might be reusing a semaphore that belonged to a failed process. So
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 3bddf49..3ebde79 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -172,6 +172,9 @@ static int	UseNewLine = 1;		/* Use newlines query delimiters (the default) */
 static int	UseNewLine = 0;		/* Use EOF as query delimiters */
 #endif   /* TCOP_DONTUSENEWLINE */
 
+/* whether we were cancelled during recovery by conflict processing or not */
+static bool RecoveryConflictPending = false; 
+static bool TransactionCancelPending = false;
 
 /* ----------------------------------------------------------------
  *		decls for routines only used in this file
@@ -185,6 +188,7 @@ static List *pg_rewrite_query(Query *query);
 static bool check_log_statement(List *stmt_list);
 static int	errdetail_execute(List *raw_parsetree_list);
 static int	errdetail_params(ParamListInfo params);
+static int  errdetail_abort(void);
 static void start_xact_command(void);
 static void finish_xact_command(void);
 static bool IsTransactionExitStmt(Node *parsetree);
@@ -945,7 +949,8 @@ exec_simple_query(const char *query_string)
 			ereport(ERROR,
 					(errcode(ERRCODE_IN_FAILED_SQL_TRANSACTION),
 					 errmsg("current transaction is aborted, "
-						"commands ignored until end of transaction block")));
+							"commands ignored until end of transaction block"),
+					 errdetail_abort()));
 
 		/* Make sure we are in a transaction command */
 		start_xact_command();
@@ -1254,7 +1259,8 @@ exec_parse_message(const char *query_string,	/* string to execute */
 			ereport(ERROR,
 					(errcode(ERRCODE_IN_FAILED_SQL_TRANSACTION),
 					 errmsg("current transaction is aborted, "
-						"commands ignored until end of transaction block")));
+							"commands ignored until end of transaction block"),
+					 errdetail_abort()));
 
 		/*
 		 * Set up a snapshot if parse analysis/planning will need one.
@@ -1534,7 +1540,8 @@ exec_bind_message(StringInfo input_message)
 		ereport(ERROR,
 				(errcode(ERRCODE_IN_FAILED_SQL_TRANSACTION),
 				 errmsg("current transaction is aborted, "
-						"commands ignored until end of transaction block")));
+						"commands ignored until end of transaction block"),
+				 errdetail_abort()));
 
 	/*
 	 * Create the portal.  Allow silent replacement of an existing portal only
@@ -1975,7 +1982,8 @@ exec_execute_message(const char *portal_name, long max_rows)
 		ereport(ERROR,
 				(errcode(ERRCODE_IN_FAILED_SQL_TRANSACTION),
 				 errmsg("current transaction is aborted, "
-						"commands ignored until end of transaction block")));
+						"commands ignored until end of transaction block"),
+				 errdetail_abort()));
 
 	/* Check for cancel signal before we start execution */
 	CHECK_FOR_INTERRUPTS();
@@ -2236,6 +2244,20 @@ errdetail_params(ParamListInfo params)
 }
 
 /*
+ * errdetail_abort
+ *
+ * Add an errdetail() line showing abort reason, if any.
+ */
+static int
+errdetail_abort(void)
+{
+	if (MyProc->recoveryConflictPending)
+		errdetail("abort reason: recovery conflict");
+
+	return 0;
+}
+
+/*
  * exec_describe_statement_message
  *
  * Process a "Describe" message for a prepared statement
@@ -2292,7 +2314,8 @@ exec_describe_statement_message(const char *stmt_name)
 		ereport(ERROR,
 				(errcode(ERRCODE_IN_FAILED_SQL_TRANSACTION),
 				 errmsg("current transaction is aborted, "
-						"commands ignored until end of transaction block")));
+						"commands ignored until end of transaction block"),
+				 errdetail_abort()));
 
 	if (whereToSendOutput != DestRemote)
 		return;					/* can't actually do anything... */
@@ -2372,7 +2395,8 @@ exec_describe_portal_message(const char *portal_name)
 		ereport(ERROR,
 				(errcode(ERRCODE_IN_FAILED_SQL_TRANSACTION),
 				 errmsg("current transaction is aborted, "
-						"commands ignored until end of transaction block")));
+						"commands ignored until end of transaction block"),
+				 errdetail_abort()));
 
 	if (whereToSendOutput != DestRemote)
 		return;					/* can't actually do anything... */
@@ -2643,9 +2667,59 @@ StatementCancelHandler(SIGNAL_ARGS)
 		 * If it's safe to interrupt, and we're waiting for a lock, service
 		 * the interrupt immediately.  No point in interrupting if we're
 		 * waiting for input, however.
+		*/
+		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
+				CritSectionCount == 0 && !DoingCommandRead)
+		{
+			/* bump holdoff count to make ProcessInterrupts() a no-op */
+			/* until we are done getting ready for it */
+			InterruptHoldoffCount++;
+			LockWaitCancel();	/* prevent CheckDeadLock from running */
+			DisableNotifyInterrupt();
+			DisableCatchupInterrupt();
+			InterruptHoldoffCount--;
+			ProcessInterrupts();
+		}
+	}
+
+	errno = save_errno;
+}
+
+void
+RecoveryConflictInterrupt(int conflict_mode)
+{
+	int			save_errno = errno;
+
+	/*
+	 * Don't joggle the elbow of proc_exit
+	 */
+	if (!proc_exit_inprogress)
+	{
+		switch (conflict_mode)
+		{
+			case CONFLICT_MODE_FATAL:
+				ProcDiePending = true;
+				break;
+
+			case CONFLICT_MODE_ERROR:
+				if (IsAbortedTransactionBlockState())
+					return;
+				TransactionCancelPending = true;
+				break;
+
+			default:
+				elog(ERROR, "Unknown conflict mode");
+		}
+
+		RecoveryConflictPending = true;
+		InterruptPending = true;
+
+		/*
+		 * If it's safe to interrupt, and we're waiting for input or a lock,
+		 * service the interrupt immediately. Same as in die()
 		 */
-		if (InterruptHoldoffCount == 0 && CritSectionCount == 0 &&
-			(DoingCommandRead || ImmediateInterruptOK))
+		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
+			CritSectionCount == 0)
 		{
 			/* bump holdoff count to make ProcessInterrupts() a no-op */
 			/* until we are done getting ready for it */
@@ -2709,6 +2783,10 @@ ProcessInterrupts(void)
 			ereport(FATAL,
 					(errcode(ERRCODE_ADMIN_SHUTDOWN),
 					 errmsg("terminating autovacuum process due to administrator command")));
+		else if (RecoveryConflictPending)
+			ereport(NOTICE,
+					(errcode(ERRCODE_ADMIN_SHUTDOWN),
+					 errmsg("terminating connection due to conflict with recovery")));
 		else
 			ereport(FATAL,
 					(errcode(ERRCODE_ADMIN_SHUTDOWN),
@@ -2735,59 +2813,42 @@ ProcessInterrupts(void)
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling autovacuum task")));
-		else
-		{
-			int cancelMode = MyProc->recoveryConflictMode;
-
-			/*
-			 * XXXHS: We don't yet have a clean way to cancel an
-			 * idle-in-transaction session, so make it FATAL instead.
-			 * This isn't as bad as it looks because we don't issue a
-			 * CONFLICT_MODE_ERROR for a session with proc->xmin == 0
-			 * on cleanup conflicts. There's a possibility that we
-			 * marked somebody as a conflict and then they go idle.
-			 */
-			if (DoingCommandRead && IsTransactionBlock() &&
-				cancelMode == CONFLICT_MODE_ERROR)
-			{
-				cancelMode = CONFLICT_MODE_FATAL;
-			}
-
-			switch (cancelMode)
-			{
-				case CONFLICT_MODE_FATAL:
-						Assert(RecoveryInProgress());
-						ereport(FATAL,
-							(errcode(ERRCODE_QUERY_CANCELED),
-							 errmsg("canceling session due to conflict with recovery")));
-
-				case CONFLICT_MODE_ERROR:
-						/*
-						 * We are aborting because we need to release
-						 * locks. So we need to abort out of all
-						 * subtransactions to make sure we release
-						 * all locks at whatever their level.
-						 *
-						 * XXX Should we try to examine the
-						 * transaction tree and cancel just enough
-						 * subxacts to remove locks? Doubt it.
-						 */
-						Assert(RecoveryInProgress());
-						AbortOutOfAnyTransaction();
-						ereport(ERROR,
-							(errcode(ERRCODE_QUERY_CANCELED),
-							 errmsg("canceling statement due to conflict with recovery")));
-						return;
-
-				default:
-						/* No conflict pending, so fall through */
-						break;
-			}
-
+		else 
 			ereport(ERROR,
 					(errcode(ERRCODE_QUERY_CANCELED),
 					 errmsg("canceling statement due to user request")));
-		}
+	}
+	if (TransactionCancelPending)
+	{
+		TransactionCancelPending = false;
+		ImmediateInterruptOK = false;	/* not idle anymore */
+
+		/*
+		 * Cancel the transaction whether or not it was idle, so don't mention 
+		 * the word idle in the message.
+		 */
+		if (RecoveryConflictPending)
+			ereport(NOTICE,
+					(errcode(ERRCODE_QUERY_CANCELED),
+					 errmsg("canceling transaction due to conflict with recovery")));
+		else
+			ereport(NOTICE,
+					(errcode(ERRCODE_QUERY_CANCELED),
+					 errmsg("canceling transaction due to administrator request")));
+
+		/* 
+		 * Indicate transaction aborted by recovery so we can use the appropriate
+		 * error message when we enter the aborted state.
+		 */
+		AbortTransactionAndAnySubtransactions();
+		RecoveryConflictPending = false;
+
+		/*
+		 * Set the display correctly now, rather than wait for client wait loop
+		 * to cycle round and reset display at some later time.
+		 */
+		set_ps_display("idle in transaction (aborted)", false);
+		pgstat_report_activity("<IDLE> in transaction (aborted)");
 	}
 	/* If we get here, do nothing (probably, QueryCancelPending was reset) */
 }
@@ -3603,7 +3664,12 @@ PostgresMain(int argc, char *argv[], const char *username)
 		 */
 		if (send_ready_for_query)
 		{
-			if (IsTransactionOrTransactionBlock())
+			if (IsAbortedTransactionBlockState())
+			{
+				set_ps_display("idle in transaction (aborted)", false);
+				pgstat_report_activity("<IDLE> in transaction (aborted)");
+			}
+			else if (IsTransactionOrTransactionBlock())
 			{
 				set_ps_display("idle in transaction", false);
 				pgstat_report_activity("<IDLE> in transaction");
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 6eee8ca..89481a3 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -204,6 +204,7 @@ extern bool IsTransactionBlock(void);
 extern bool IsTransactionOrTransactionBlock(void);
 extern char TransactionBlockStatusCode(void);
 extern void AbortOutOfAnyTransaction(void);
+extern void AbortTransactionAndAnySubtransactions(void);
 extern void PreventTransactionChain(bool isTopLevel, const char *stmtType);
 extern void RequireTransactionChain(bool isTopLevel, const char *stmtType);
 extern bool IsInTransactionChain(bool isTopLevel);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index 2298bf2..8105d39 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -96,11 +96,11 @@ struct PGPROC
 	uint8		vacuumFlags;	/* vacuum-related flags, see above */
 
 	/*
-	 * While in hot standby mode, setting recoveryConflictMode instructs
-	 * the backend to commit suicide. Possible values are the same as those
-	 * passed to ResolveRecoveryConflictWithVirtualXIDs().
+	 * While in hot standby mode, shows that a conflict signal has been sent
+	 * for the current transaction. Set/cleared while holding ProcArrayLock,
+	 * though not required. Accessed without lock, if needed.
 	 */
-	int			recoveryConflictMode;
+	bool		recoveryConflictPending;
 
 	/* Info about LWLock the process is currently waiting for, if any. */
 	bool		lwWaiting;		/* true if waiting for an LW lock */
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 1cee639..ec358b1 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -15,6 +15,7 @@
 #define PROCARRAY_H
 
 #include "storage/lock.h"
+#include "storage/procsignal.h"
 #include "storage/standby.h"
 #include "utils/snapshot.h"
 
@@ -58,8 +59,7 @@ extern VirtualTransactionId *GetCurrentVirtualXIDs(TransactionId limitXmin,
 					  int *nvxids);
 extern VirtualTransactionId *GetConflictingVirtualXIDs(TransactionId limitXmin,
 					Oid dbOid, bool skipExistingConflicts);
-extern pid_t CancelVirtualTransaction(VirtualTransactionId vxid,
-						 int cancel_mode);
+extern pid_t CancelVirtualTransaction(VirtualTransactionId vxid, ProcSignalReason sigmode);
 
 extern int	CountActiveBackends(void);
 extern int	CountDBBackends(Oid databaseid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 7f3b65e..59a2f97 100644
--- a/src/include/storage/procsignal.h
+++ b/src/include/storage/procsignal.h
@@ -31,6 +31,8 @@ typedef enum
 {
 	PROCSIG_CATCHUP_INTERRUPT,	/* sinval catchup interrupt */
 	PROCSIG_NOTIFY_INTERRUPT,	/* listen/notify interrupt */
+	PROCSIG_CONFLICT_ERROR_INTERRUPT,	/* recovery conflict error */
+	PROCSIG_CONFLICT_FATAL_INTERRUPT,	/* recovery conflict fatal */
 
 	NUM_PROCSIGNALS				/* Must be last! */
 } ProcSignalReason;
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 38df3d7..d6a7f0c 100644
--- a/src/include/storage/standby.h
+++ b/src/include/storage/standby.h
@@ -26,6 +26,7 @@ extern int	vacuum_defer_cleanup_age;
 
 extern void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
 									   char *reason, int cancel_mode);
+extern void HandleConflictInterrupt(int conflict_mode);
 
 extern void InitRecoveryTransactionEnvironment(void);
 extern void ShutdownRecoveryTransactionEnvironment(void);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index d7ede2b..fc5ab9e 100644
--- a/src/include/tcop/tcopprot.h
+++ b/src/include/tcop/tcopprot.h
@@ -63,6 +63,7 @@ extern bool assign_max_stack_depth(int newval, bool doit, GucSource source);
 extern void die(SIGNAL_ARGS);
 extern void quickdie(SIGNAL_ARGS);
 extern void StatementCancelHandler(SIGNAL_ARGS);
+extern void RecoveryConflictInterrupt(int conflict_mode);
 extern void FloatExceptionHandler(SIGNAL_ARGS);
 extern void prepare_for_client_read(void);
 extern void client_read_ended(void);
#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#28)
Re: Hot Standy introduced problem with query cancel behavior

Simon Riggs <simon@2ndQuadrant.com> writes:

On Thu, 2010-01-07 at 12:14 -0500, Tom Lane wrote:

While we're discussing this: the current coding with
AbortOutOfAnyTransaction within ProcessInterrupts is *utterly* unsafe.
I realize that's just a toy placeholder, but getting rid of it has to be
on the list of stop-ship items. Right at the moment I'd prefer to see
CONFLICT_MODE_ERROR always turned into CONFLICT_MODE_FATAL than to
imagine this is going to work.

Hmmm. Can you check my current attempt? This may suffer this problem.

This looks like a mess. You've duplicated a whole lot of code and not
fixed the fundamental problem.

If, so can you explain a little more for me? Thanks.

You can not do this from inside an interrupt service routine. Period.
No amount of deck-chair-rearrangement will fix that.

As far as I can think at the moment, the best you can do is throw the
elog(ERROR), and if control gets out to the error recovery block in
PostgresMain, you can force a transaction abort there. In other words,
pretty much the same logic that was there before; the only addition that
I think is safe is to allow this to happen while DoingCommandRead, so
that you can cancel an idle transaction.

Now of course the problem with this approach, if you choose to see it as
a problem, is that somebody could trap the error and try to continue
processing. The only way you can positively guarantee that the backend
will give up whatever locks it's holding is if you elog(FATAL) instead
of trying to do normal error processing. So maybe the right thing is to
forget about CONFLICT_MODE_ERROR altogether. How critical is it that an
HS-requested query cancel be any more likely to do anything than a
regular query cancel is?

regards, tom lane

#30Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#29)
Re: Hot Standy introduced problem with query cancel behavior

On Thursday 07 January 2010 19:12:31 Tom Lane wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On Thu, 2010-01-07 at 12:14 -0500, Tom Lane wrote:

While we're discussing this: the current coding with
AbortOutOfAnyTransaction within ProcessInterrupts is *utterly* unsafe.
I realize that's just a toy placeholder, but getting rid of it has to be
on the list of stop-ship items. Right at the moment I'd prefer to see
CONFLICT_MODE_ERROR always turned into CONFLICT_MODE_FATAL than to
imagine this is going to work.

Hmmm. Can you check my current attempt? This may suffer this problem.

This looks like a mess. You've duplicated a whole lot of code and not
fixed the fundamental problem.

If, so can you explain a little more for me? Thanks.

You can not do this from inside an interrupt service routine. Period.
No amount of deck-chair-rearrangement will fix that.

As far as I can think at the moment, the best you can do is throw the
elog(ERROR), and if control gets out to the error recovery block in
PostgresMain, you can force a transaction abort there. In other words,
pretty much the same logic that was there before; the only addition that
I think is safe is to allow this to happen while DoingCommandRead, so
that you can cancel an idle transaction.

Stupid question:

Why dont we add CHECK_FOR_INTERRUPTS (or something similar) to everything
calling recv (especially in the EINTR) case?
That way we can simply abort in the normal context without the constraint of
an interrupt handler because recv() will finish after having serviced a signal
handler.

Sure there will be some parts calling CHECK_FOR_INTERRUPTS not often enough
but thats surely not that critical - and after some time using a bit more
force is ok I guess.

Andres

#31Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#29)
Re: Hot Standy introduced problem with query cancel behavior

On Thu, 2010-01-07 at 13:12 -0500, Tom Lane wrote:

not fixed the fundamental problem.

(I wasn't saying that I had, only wanted to confirm the problem still
existed in the version I was currently working on.)

How critical is it that an
HS-requested query cancel be any more likely to do anything than a
regular query cancel is?

Recovery needs to cancel backends in two main cases, which currently
throw CONFLICT_MODE_ERROR, though could be separated:

(1) Cancel because a snapshot might fail to see data that has now been
removed and so get an inconsistent result. We need to abort any
subtransactions that have open snapshots, which is really the whole top
level xact, unless anyone has a good plan of how to identify

(2) Cancel because a backend holds a lock on a table that has been
AccessExclusiveLocked on primary. We need to abort as far up the
transaction stack as it takes to remove the conflicting lock. We
currently abort the whole transaction.

Both of those have cases where we might need to abort further than just
the top-level subtransaction. We might be able to identify the cases
where aborting only the current subtrans is OK, and then just throw
normal ERROR for those cases. I think the cases are

* when no subxacts exist

or

* for (1): if no open portals in still active parent sub-transactions
* for (2): if any conflicting locks are held by current subxact only

--
Simon Riggs www.2ndQuadrant.com

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#30)
Re: Hot Standy introduced problem with query cancel behavior

Andres Freund <andres@anarazel.de> writes:

Stupid question:

Why dont we add CHECK_FOR_INTERRUPTS (or something similar) to everything
calling recv (especially in the EINTR) case?

We pretty much have CHECK_FOR_INTERRUPTS everywhere that it's safe
already. The problem here is not a lack of execution of
CHECK_FOR_INTERRUPTS, but what to do inside it. Although I pointed to
an interrupt service routine as being the worst case, the fact is that
Simon's code will crash the system in a large number of scenarios even
when ProcessInterrupts is called from CHECK_FOR_INTERRUPTS rather than
directly from the signal handler. You can't just abort transactions and
then return to a nest of functions that think they still have a live
transaction they can resume. This is why the standard error code path
has the AbortTransaction call out in the sigjmp catcher, not inside
elog() itself.

regards, tom lane

#33Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#30)
Re: Hot Standy introduced problem with query cancel behavior

On Thu, 2010-01-07 at 19:23 +0100, Andres Freund wrote:

On Thursday 07 January 2010 19:12:31 Tom Lane wrote:

As far as I can think at the moment, the best you can do is throw the
elog(ERROR), and if control gets out to the error recovery block in
PostgresMain, you can force a transaction abort there. In other words,
pretty much the same logic that was there before; the only addition that
I think is safe is to allow this to happen while DoingCommandRead, so
that you can cancel an idle transaction.

Stupid question:

Why dont we add CHECK_FOR_INTERRUPTS (or something similar) to everything
calling recv (especially in the EINTR) case?
That way we can simply abort in the normal context without the constraint of
an interrupt handler because recv() will finish after having serviced a signal
handler.

Sure there will be some parts calling CHECK_FOR_INTERRUPTS not often enough
but thats surely not that critical - and after some time using a bit more
force is ok I guess.

There's only a need for an immediate death when we were waiting for a
lock. In other cases a deferred death is possible. We could do that in
various places, such as each time we access a new data block.

--
Simon Riggs www.2ndQuadrant.com

#34Joachim Wieland
joe@mcknight.de
In reply to: Tom Lane (#29)
Re: Hot Standy introduced problem with query cancel behavior

On Thu, Jan 7, 2010 at 7:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

As far as I can think at the moment, the best you can do is throw the
elog(ERROR), and if control gets out to the error recovery block in
PostgresMain, you can force a transaction abort there.  In other words,
pretty much the same logic that was there before; the only addition that
I think is safe is to allow this to happen while DoingCommandRead, so
that you can cancel an idle transaction.

Sorry, but to be clear about this, what do you mean with "allow this
to happen"? You mean that while DoingCommandRead it should be safe to
abort the transaction even from the signal handler or only from the
sigjmp catcher?

Now of course the problem with this approach, if you choose to see it as
a problem, is that somebody could trap the error and try to continue
processing.  The only way you can positively guarantee that the backend
will give up whatever locks it's holding is if you elog(FATAL) instead
of trying to do normal error processing.  So maybe the right thing is to
forget about CONFLICT_MODE_ERROR altogether.  How critical is it that an
HS-requested query cancel be any more likely to do anything than a
regular query cancel is?

Simon, couldn't you just translate the conflict modes to the other
cancel modes depending on DoingCommandRead (which is to be determined
from within ProcessInterrupts(), not before).

CONFLICT_MODE_ERROR && !DoingCommandRead => cancel running query
(QueryCancelPending)
CONFLICT_MODE_ERROR && DoingCommandRead => cancel idle transaction
CONFLICT_MODE_FATAL => cancel session (ProcDiePending)

Joachim

#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joachim Wieland (#34)
Re: Hot Standy introduced problem with query cancel behavior

Joachim Wieland <joe@mcknight.de> writes:

On Thu, Jan 7, 2010 at 7:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

As far as I can think at the moment, the best you can do is throw the
elog(ERROR), and if control gets out to the error recovery block in
PostgresMain, you can force a transaction abort there. �In other words,
pretty much the same logic that was there before; the only addition that
I think is safe is to allow this to happen while DoingCommandRead, so
that you can cancel an idle transaction.

Sorry, but to be clear about this, what do you mean with "allow this
to happen"? You mean that while DoingCommandRead it should be safe to
abort the transaction even from the signal handler or only from the
sigjmp catcher?

In the previous coding, nothing at all happened if DoingCommandRead.
What I am suggesting (and what should be actually safe after my fixes
a couple hours ago) is that it is okay to allow a query-cancel error
to be thrown while in DoingCommandRead state. (Of course there are
still nasty implications for the FE/BE protocol, but at least you won't
crash the backend by doing so.) What is not, and never will be, safe
is to do any sort of transaction-aborting work inside ProcessInterrupts.
You can either throw a regular elog(ERROR) and hope that subsequent
transaction abort will do what you want, or throw elog(FATAL) and let
the cleanup happen during proc_exit. The reason the second form is safe
is that control won't ever return to whatever it was you interrupted,
and so any expectations that such code might have about being able to
recover from the error and continue its processing won't matter.

regards, tom lane

#36Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#32)
Re: Hot Standy introduced problem with query cancel behavior

On Thursday 07 January 2010 19:49:59 Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

Stupid question:

Why dont we add CHECK_FOR_INTERRUPTS (or something similar) to everything
calling recv (especially in the EINTR) case?

We pretty much have CHECK_FOR_INTERRUPTS everywhere that it's safe
already. The problem here is not a lack of execution of
CHECK_FOR_INTERRUPTS, but what to do inside it. Although I pointed to
an interrupt service routine as being the worst case, the fact is that
Simon's code will crash the system in a large number of scenarios even
when ProcessInterrupts is called from CHECK_FOR_INTERRUPTS rather than
directly from the signal handler.

I did not want to suggest using Simons code there. Sorry for the brevity.

The reason I suggested adding CHECK_FOR_INTERRUPTS into the recv code path was
that it should allow a relatively "natural" handling of canceling "IDLE IN
TRANSACTION" queries without doing anything in the interrupt handler.

I think it shouldn't be to hard to make that code path safe for
CHECK_FOR_INTERRUPTS().

I personally don't think its important to be that nice to a non-cooperating
backend (i.e. one catching our ERRORs) so I dont see any problems in just
FATAL'ing it after a couple of tries (the code retries already so...).

Andres

#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#36)
Re: Hot Standy introduced problem with query cancel behavior

Andres Freund <andres@anarazel.de> writes:

The reason I suggested adding CHECK_FOR_INTERRUPTS into the recv code path was
that it should allow a relatively "natural" handling of canceling "IDLE IN
TRANSACTION" queries without doing anything in the interrupt handler.

I think it shouldn't be to hard to make that code path safe for
CHECK_FOR_INTERRUPTS().

Idle in transaction isn't the problem (except for what it does to the
FE/BE protocol state). The problem is what happens inside a non-idle
transaction.

Since apparently I'm still not being clear enough about this, let me
spell it out:

1. Outer transaction calls, say, a plperl function.
2. plperl function executes some query via SPI, thereby starting
a subtransaction.
3. We receive an HS query-cancel interrupt. Since
!ImmediateInterruptOK, this just sets QueryCancelPending.
4. At the next occurrence of CHECK_FOR_INTERRUPTS, ProcessInterrupts
is entered.
5. According to both Simon's committed patch and his recent variant,
ProcessInterrupts executes AbortOutOfAnyTransaction and then throws
elog(ERROR).
6. plperl.c catches the elog longjmp and tries to abort its
subtransaction (loss #1), then return to the Perl interpreter
which is under no obligation to abort processing its perl script
(loss #2), and whenever it does exit, or else call SPI to try to
process another query, we're screwed because the outer transaction
is already dead (loss #3).

The situation with Perl or Python or some other PL is pretty much
the worst case, since we have no control whatever over that code
layer --- but in reality this type of scenario can play out even
without any third-party code involved. Anyplace that catches an
elog longjmp will be broken by AbortOutOfAnyTransaction inside
ProcessInterrupts, because things aren't supposed to happen in that
order.

regards, tom lane

#38Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#37)
Re: Hot Standy introduced problem with query cancel behavior

On Thursday 07 January 2010 21:47:47 Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

The reason I suggested adding CHECK_FOR_INTERRUPTS into the recv code
path was that it should allow a relatively "natural" handling of
canceling "IDLE IN TRANSACTION" queries without doing anything in the
interrupt handler.

I think it shouldn't be to hard to make that code path safe for
CHECK_FOR_INTERRUPTS().

Idle in transaction isn't the problem (except for what it does to the
FE/BE protocol state). The problem is what happens inside a non-idle
transaction.

Since apparently I'm still not being clear enough about this, let me
spell it out:
5. According to both Simon's committed patch and his recent variant,
ProcessInterrupts executes AbortOutOfAnyTransaction and then throws
elog(ERROR).

Andres Freund <andres@anarazel.de> writes:

I did not want to suggest using Simons code there. Sorry for the brevity.

should have read as "revert to old code and add two step killing (thats likely
around 10 lines or so)".

"two step killing" meaning that we signal ERROR for a few times and if nothing
happens that we like, we signal FATAL.
As the code already loops around signaling anyway that should be easy to
implement.

I have no doubt about you being right that its not practical (even more so at
this point in the release cycle) to make that AbortOutOfAnyTransaction call.

Andres

PS: Thats procarray.c: ResolveRecoveryConflictWithVirtualXIDs

#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#38)
Re: Hot Standy introduced problem with query cancel behavior

Andres Freund <andres@anarazel.de> writes:

I did not want to suggest using Simons code there. Sorry for the brevity.
should have read as "revert to old code and add two step killing (thats likely
around 10 lines or so)".

"two step killing" meaning that we signal ERROR for a few times and if nothing
happens that we like, we signal FATAL.
As the code already loops around signaling anyway that should be easy to
implement.

Ah. This loop happens in the process that's trying to send the cancel
signal, correct, not the one that needs to respond to it? That sounds
fairly sane to me.

There are some things we could do to make it more likely that a cancel
of this type is accepted --- for instance, give it a distinct SQLSTATE
code that *can not* be trapped by plpgsql EXCEPTION blocks --- but there
is no practical way to guarantee it except elog(FATAL). I'm not
entirely convinced that an untrappable error would be a good thing
anyway; it's hard to argue that that's much better than a FATAL.

regards, tom lane

#40Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#39)
Re: Hot Standy introduced problem with query cancel behavior

On Thursday 07 January 2010 22:28:46 Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I did not want to suggest using Simons code there. Sorry for the brevity.
should have read as "revert to old code and add two step killing (thats
likely around 10 lines or so)".

"two step killing" meaning that we signal ERROR for a few times and if
nothing happens that we like, we signal FATAL.
As the code already loops around signaling anyway that should be easy to
implement.

Ah. This loop happens in the process that's trying to send the cancel
signal, correct, not the one that needs to respond to it? That sounds
fairly sane to me.

Yes.

There are some things we could do to make it more likely that a cancel
of this type is accepted --- for instance, give it a distinct SQLSTATE
code that *can not* be trapped by plpgsql EXCEPTION blocks --- but there
is no practical way to guarantee it except elog(FATAL). I'm not
entirely convinced that an untrappable error would be a good thing
anyway; it's hard to argue that that's much better than a FATAL.

Well a session which is usable after a transaction abort is quite sensible -
quite some software I know handles a failing transaction much more gracefully
than a session abort (e.g. because it has to deal with serialization failures
and such anyway).

So making it cought by fewer places and degrading to FATAL sound sensible and
relatively easy to me.
Unless somebody disagrees I will give it a shot.

@Simon: Could you possibly push your current HS state to the git repo? That
would make it easier to see whats already applied and such. That would be
really nice.

Andres

#41Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#40)
1 attachment(s)
Re: Hot Standy introduced problem with query cancel behavior

On 01/07/10 22:37, Andres Freund wrote:

On Thursday 07 January 2010 22:28:46 Tom Lane wrote:

Andres Freund<andres@anarazel.de> writes:

I did not want to suggest using Simons code there. Sorry for the brevity.
should have read as "revert to old code and add two step killing (thats
likely around 10 lines or so)".

"two step killing" meaning that we signal ERROR for a few times and if
nothing happens that we like, we signal FATAL.
As the code already loops around signaling anyway that should be easy to
implement.

Ah. This loop happens in the process that's trying to send the cancel
signal, correct, not the one that needs to respond to it? That sounds
fairly sane to me.

Yes.

There are some things we could do to make it more likely that a cancel
of this type is accepted --- for instance, give it a distinct SQLSTATE
code that *can not* be trapped by plpgsql EXCEPTION blocks --- but there
is no practical way to guarantee it except elog(FATAL). I'm not
entirely convinced that an untrappable error would be a good thing
anyway; it's hard to argue that that's much better than a FATAL.

Well a session which is usable after a transaction abort is quite sensible -
quite some software I know handles a failing transaction much more gracefully
than a session abort (e.g. because it has to deal with serialization failures
and such anyway).

So making it cought by fewer places and degrading to FATAL sound sensible and
relatively easy to me.
Unless somebody disagrees I will give it a shot.

Ok, here is a stab at that:

1. Allow the signal multiplexing facility to transfer one sig_atomic_t
worth of data. This is usefull e.g. for making operations idempotent
or more precise.

In this the LocalBackendId is transported - that should make it
impossible to cancel the wrong transaction

Use the signal multiplexing facility.

2.

AbortTransactionAndAnySubtransactions is only used in the mainloops
error handler as it should be unproblematic there.

In the current CVS code ConditionalVirtualXactLockTableWait() in
ResolveRecoveryConflictWithVirtualXIDs does the wait for every try of
cancelling the other transaction.

I moved the retry logic into CancelVirtualTransaction(). If 50 times a
ERROR does nothing it degrades to FATAL

XXX: I temporarily do not use the skipExistingConflicts argument of
GetConflictingVirtualXIDs - I dont understand its purpose and a bit of
infrastructure is missing right now as the recoveryConflictMode is not
stored anymore anywhere. Can easily be added back though.

3.
Add a new error code ERRCODE_QUERY_CANCELED_HS for use with HS
indicating a failure that is more than a plain
ERRCODE_QUERY_CANCELED - namely it should not be caught from
various places like savepoints and in PLs.

Exemplarily I checked for that error code in plpgsql and make it
uncatcheable.

I am not sure how new errorcode codes get chosen though - and the name
is not that nice.

Opinions on that?

I copied quite a bit from Simons earlier patch...

---

Currently the patch does not yet do anything to avoid letting the
protocol out of sync. What do you think about adding a flag for error
codes not to communicate with the client (Similarly to COMERROR)?

So that one could do an elog(ERROR & ERROR_NO_SEND_CLIENT, .. or such?

Andres

PS: My current MUA suffers from a wronggone upgrade currently, so no
idea how that message will appear.

Attachments:

hs-query-cancel.patchtext/x-diff; name=hs-query-cancel.patchDownload
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 90e7b4a..a6e2405 100644
*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
*************** AbortOutOfAnyTransaction(void)
*** 3712,3717 ****
--- 3712,3761 ----
  }
  
  /*
+  *	AbortTransactionAndAnySubtransactions
+  *
+  * Similar to AbortCurrentTransaction but if any subtransactions
+  * in progress we abort them *and* all of their parents. So this is
+  * used when the caller wishes to make the abort untrappable by the user.
+  * After this has run IsAbortedTransactionBlockState() will be true.
+  */
+ void
+ AbortTransactionAndAnySubtransactions(void)
+ {
+ 	while(true){
+ 		TransactionState s = CurrentTransactionState;
+ 
+ 		switch (s->blockState)
+ 		{
+ 			case TBLOCK_DEFAULT:
+ 			case TBLOCK_STARTED:
+ 			case TBLOCK_BEGIN:
+ 			case TBLOCK_INPROGRESS:
+ 			case TBLOCK_END:
+ 			case TBLOCK_ABORT:
+ 			case TBLOCK_SUBABORT:
+ 			case TBLOCK_ABORT_END:
+ 			case TBLOCK_ABORT_PENDING:
+ 			case TBLOCK_PREPARE:
+ 			case TBLOCK_SUBABORT_END:
+ 			case TBLOCK_SUBABORT_RESTART:
+ 				AbortCurrentTransaction();
+ 				return;
+ 				break;
+ 
+ 			case TBLOCK_SUBINPROGRESS:
+ 			case TBLOCK_SUBBEGIN:
+ 			case TBLOCK_SUBEND:
+ 			case TBLOCK_SUBABORT_PENDING:
+ 			case TBLOCK_SUBRESTART:
+ 				AbortSubTransaction();
+ 				CleanupSubTransaction();
+ 				break;
+ 		}
+ 	}
+ }
+ 
+ /*
   * IsTransactionBlock --- are we within a transaction block?
   */
  bool
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 9cb28f7..d8ea470 100644
*** a/src/backend/storage/ipc/procarray.c
--- b/src/backend/storage/ipc/procarray.c
***************
*** 51,56 ****
--- 51,57 ----
  #include "access/xact.h"
  #include "access/twophase.h"
  #include "miscadmin.h"
+ #include "storage/lmgr.h"
  #include "storage/procarray.h"
  #include "storage/standby.h"
  #include "utils/builtins.h"
*************** ProcArrayClearTransaction(PGPROC *proc)
*** 377,383 ****
  	proc->xid = InvalidTransactionId;
  	proc->lxid = InvalidLocalTransactionId;
  	proc->xmin = InvalidTransactionId;
- 	proc->recoveryConflictMode = 0;
  
  	/* redundant, but just in case */
  	proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
--- 378,383 ----
*************** GetConflictingVirtualXIDs(TransactionId 
*** 1665,1672 ****
  		if (proc->pid == 0)
  			continue;
  
! 		if (skipExistingConflicts && proc->recoveryConflictMode > 0)
! 			continue;
  
  		if (!OidIsValid(dbOid) ||
  			proc->databaseId == dbOid)
--- 1665,1677 ----
  		if (proc->pid == 0)
  			continue;
  
! 		/*
! 		 * XXX: doesn't skipExistingConflicts have the danger of
! 		 * shadowing an FATAL error?
! 		 * I.e. ERROR is signalled first and then FATAL is signalled...
! 		 * if (skipExistingConflicts && ...)
! 		 *   continue;
! 		 */
  
  		if (!OidIsValid(dbOid) ||
  			proc->databaseId == dbOid)
*************** GetConflictingVirtualXIDs(TransactionId 
*** 1704,1717 ****
   * Returns pid of the process signaled, or 0 if not found.
   */
  pid_t
! CancelVirtualTransaction(VirtualTransactionId vxid, int cancel_mode)
  {
  	ProcArrayStruct *arrayP = procArray;
  	int			index;
  	pid_t		pid = 0;
  
! 	LWLockAcquire(ProcArrayLock, LW_SHARED);
  
  	for (index = 0; index < arrayP->numProcs; index++)
  	{
  		VirtualTransactionId procvxid;
--- 1709,1728 ----
   * Returns pid of the process signaled, or 0 if not found.
   */
  pid_t
! CancelVirtualTransaction(VirtualTransactionId vxid,
! 						 recovery_conflict_mode cancel_mode)
  {
  	ProcArrayStruct *arrayP = procArray;
  	int			index;
  	pid_t		pid = 0;
  
! 	ProcSignalReason  sigmode = PROCSIG_CONFLICT_ERROR_INTERRUPT;
! 	size_t num_locks = 0;
  
+ 	/*
+ 	 * Find pid
+ 	 */
+ 	LWLockAcquire(ProcArrayLock, LW_SHARED);
  	for (index = 0; index < arrayP->numProcs; index++)
  	{
  		VirtualTransactionId procvxid;
*************** CancelVirtualTransaction(VirtualTransact
*** 1722,1747 ****
  		if (procvxid.backendId == vxid.backendId &&
  			procvxid.localTransactionId == vxid.localTransactionId)
  		{
- 			/*
- 			 * Issue orders for the proc to read next time it receives SIGINT
- 			 */
- 			if (proc->recoveryConflictMode < cancel_mode)
- 				proc->recoveryConflictMode = cancel_mode;
- 
  			pid = proc->pid;
  			break;
  		}
  	}
- 
  	LWLockRelease(ProcArrayLock);
  
! 	if (pid != 0)
! 	{
  		/*
! 		 * Kill the pid if it's still here. If not, that's what we wanted
! 		 * so ignore any errors.
  		 */
! 		kill(pid, SIGINT);
  	}
  
  	return pid;
--- 1733,1782 ----
  		if (procvxid.backendId == vxid.backendId &&
  			procvxid.localTransactionId == vxid.localTransactionId)
  		{
  			pid = proc->pid;
  			break;
  		}
  	}
  	LWLockRelease(ProcArrayLock);
  
! 	if(!pid)
! 		return pid;
! 
! 	/*
! 	 * Kill the transaction. We retry 50 times - if we didnt finish
! 	 * the transaction till then, we start using FATAL.
! 	 *
! 	 * We dont need to lock while signalling as the messages contain
! 	 * only sig_atomic_t wide contents which itself identify the
! 	 * operation completely. I.e. it contains the local transaction
! 	 * id, so it can't abort the wrong transaction.
! 	 */
! 	if (cancel_mode == CONFLICT_MODE_FATAL)
! 		sigmode = PROCSIG_CONFLICT_FATAL_INTERRUPT;
! 
! 	while(!ConditionalVirtualXactLockTableWait(vxid)){
! 		if(sigmode == PROCSIG_CONFLICT_ERROR_INTERRUPT && num_locks++ > 50){
! 			sigmode = PROCSIG_CONFLICT_FATAL_INTERRUPT;
! 
! 			elog(trace_recovery(DEBUG1),
! 				 "recovery tried to cancel virtual transaction %u/%u pid %ld. No success so far. Using FATAL",
! 				 vxid.backendId,
! 				 vxid.localTransactionId,
! 				 (long) pid);
! 		}
! 
! 		if(sigmode == PROCSIG_CONFLICT_FATAL_INTERRUPT){
! 			SendProcSignal(pid, sigmode, vxid.backendId, pid);
! 		}
! 		else{
! 			SendProcSignal(pid, sigmode, vxid.backendId, vxid.localTransactionId);
! 		}
! 
  		/*
! 		 * Wait awhile for it to die so that we avoid flooding an
! 		 * unresponsive backend when system is heavily loaded.
  		 */
! 		pg_usleep(5000);
  	}
  
  	return pid;
*************** CancelDBBackends(Oid databaseid)
*** 1842,1851 ****
  	{
  		volatile PGPROC *proc = arrayP->procs[index];
  
  		if (proc->databaseId == databaseid)
  		{
! 			proc->recoveryConflictMode = CONFLICT_MODE_FATAL;
! 			kill(proc->pid, SIGINT);
  		}
  	}
  
--- 1877,1887 ----
  	{
  		volatile PGPROC *proc = arrayP->procs[index];
  
+ 
  		if (proc->databaseId == databaseid)
  		{
! 			SendProcSignal(proc->pid, PROCSIG_CONFLICT_FATAL_INTERRUPT,
! 						   InvalidBackendId, true);
  		}
  	}
  
diff --git a/src/backend/storage/ipc/procsignal.c b/src/backend/storage/ipc/procsignal.c
index 2892727..1fbb886 100644
*** a/src/backend/storage/ipc/procsignal.c
--- b/src/backend/storage/ipc/procsignal.c
***************
*** 24,29 ****
--- 24,31 ----
  #include "storage/procsignal.h"
  #include "storage/shmem.h"
  #include "storage/sinval.h"
+ #include "storage/standby.h"
+ #include "tcop/tcopprot.h"
  
  
  /*
*************** void
*** 255,260 ****
--- 257,276 ----
  procsignal_sigusr1_handler(SIGNAL_ARGS)
  {
  	int		save_errno = errno;
+ 	sig_atomic_t signal_data;
+ 
+ 	/*
+ 	 * We check the possible signals in decreasing order of
+ 	 * importance. For example if were FATALing the backend there is
+ 	 * no point in sending out NOTIFYs before that.
+ 	 * XXX: Possibly we should not calll the other handlers at all when
+ 	 * receiving FATAL
+ 	 */
+ 	if ((signal_data = CheckProcSignal(PROCSIG_CONFLICT_FATAL_INTERRUPT)))
+ 		RecoveryConflictInterrupt(CONFLICT_MODE_FATAL, signal_data);
+ 
+ 	if ((signal_data = CheckProcSignal(PROCSIG_CONFLICT_ERROR_INTERRUPT)))
+ 		RecoveryConflictInterrupt(CONFLICT_MODE_ERROR, signal_data);
  
  	if (CheckProcSignal(PROCSIG_CATCHUP_INTERRUPT))
  		HandleCatchupInterrupt();
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 227b9cc..614914d 100644
*** a/src/backend/storage/ipc/standby.c
--- b/src/backend/storage/ipc/standby.c
*************** WaitExceedsMaxStandbyDelay(void)
*** 164,170 ****
   */
  void
  ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
! 									   char *reason, int cancel_mode)
  {
  	char		waitactivitymsg[100];
  
--- 164,170 ----
   */
  void
  ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
! 									   char *reason, recovery_conflict_mode cancel_mode)
  {
  	char		waitactivitymsg[100];
  
*************** ResolveRecoveryConflictWithVirtualXIDs(V
*** 234,246 ****
  					{
  						case CONFLICT_MODE_FATAL:
  							elog(trace_recovery(DEBUG1),
! 								 "recovery disconnects session with pid %ld because of conflict with %s",
  								 (long) pid,
  								 reason);
  							break;
  						case CONFLICT_MODE_ERROR:
  							elog(trace_recovery(DEBUG1),
! 								 "recovery cancels virtual transaction %u/%u pid %ld because of conflict with %s",
  								 waitlist->backendId,
  								 waitlist->localTransactionId,
  								 (long) pid,
--- 234,246 ----
  					{
  						case CONFLICT_MODE_FATAL:
  							elog(trace_recovery(DEBUG1),
! 								 "recovery disconnected session with pid %ld because of conflict with %s",
  								 (long) pid,
  								 reason);
  							break;
  						case CONFLICT_MODE_ERROR:
  							elog(trace_recovery(DEBUG1),
! 								 "recovery canceled virtual transaction %u/%u pid %ld because of conflict with %s",
  								 waitlist->backendId,
  								 waitlist->localTransactionId,
  								 (long) pid,
*************** ResolveRecoveryConflictWithVirtualXIDs(V
*** 250,261 ****
  							/* No conflict pending, so fall through */
  							break;
  					}
- 
- 					/*
- 					 * Wait awhile for it to die so that we avoid flooding an
- 					 * unresponsive backend when system is heavily loaded.
- 					 */
- 					pg_usleep(5000);
  				}
  			}
  		}
--- 250,255 ----
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 7daa5e8..3d4bd10 100644
*** a/src/backend/storage/lmgr/proc.c
--- b/src/backend/storage/lmgr/proc.c
*************** InitProcess(void)
*** 318,324 ****
  	MyProc->waitProcLock = NULL;
  	for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
  		SHMQueueInit(&(MyProc->myProcLocks[i]));
- 	MyProc->recoveryConflictMode = 0;
  
  	/*
  	 * We might be reusing a semaphore that belonged to a failed process. So
--- 318,323 ----
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index be20bf3..919b1a3 100644
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
*************** static int	UseNewLine = 1;		/* Use newli
*** 172,177 ****
--- 172,178 ----
  static int	UseNewLine = 0;		/* Use EOF as query delimiters */
  #endif   /* TCOP_DONTUSENEWLINE */
  
+ static LocalTransactionId CanceledLocalTransaction = InvalidLocalTransactionId;
  
  /* ----------------------------------------------------------------
   *		decls for routines only used in this file
*************** StatementCancelHandler(SIGNAL_ARGS)
*** 2642,2648 ****
  		 * service the interrupt immediately
  		 */
  		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
! 			CritSectionCount == 0)
  		{
  			/* bump holdoff count to make ProcessInterrupts() a no-op */
  			/* until we are done getting ready for it */
--- 2643,2649 ----
  		 * service the interrupt immediately
  		 */
  		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
! 			CritSectionCount == 0 && !DoingCommandRead)
  		{
  			/* bump holdoff count to make ProcessInterrupts() a no-op */
  			/* until we are done getting ready for it */
*************** ProcessInterrupts(void)
*** 2711,2819 ****
  					(errcode(ERRCODE_ADMIN_SHUTDOWN),
  			 errmsg("terminating connection due to administrator command")));
  	}
  	if (QueryCancelPending)
  	{
  		QueryCancelPending = false;
  		if (ClientAuthInProgress)
- 		{
- 			ImmediateInterruptOK = false;	/* not idle anymore */
- 			DisableNotifyInterrupt();
- 			DisableCatchupInterrupt();
- 			/* As in quickdie, don't risk sending to client during auth */
- 			if (whereToSendOutput == DestRemote)
- 				whereToSendOutput = DestNone;
  			ereport(ERROR,
  					(errcode(ERRCODE_QUERY_CANCELED),
  					 errmsg("canceling authentication due to timeout")));
! 		}
! 		if (cancel_from_timeout)
! 		{
! 			ImmediateInterruptOK = false;	/* not idle anymore */
! 			DisableNotifyInterrupt();
! 			DisableCatchupInterrupt();
  			ereport(ERROR,
  					(errcode(ERRCODE_QUERY_CANCELED),
  					 errmsg("canceling statement due to statement timeout")));
! 		}
! 		if (IsAutoVacuumWorkerProcess())
! 		{
! 			ImmediateInterruptOK = false;	/* not idle anymore */
! 			DisableNotifyInterrupt();
! 			DisableCatchupInterrupt();
  			ereport(ERROR,
  					(errcode(ERRCODE_QUERY_CANCELED),
  					 errmsg("canceling autovacuum task")));
! 		}
! 		{
! 			int cancelMode = MyProc->recoveryConflictMode;
  
  			/*
! 			 * XXXHS: We don't yet have a clean way to cancel an
! 			 * idle-in-transaction session, so make it FATAL instead.
! 			 * This isn't as bad as it looks because we don't issue a
! 			 * CONFLICT_MODE_ERROR for a session with proc->xmin == 0
! 			 * on cleanup conflicts. There's a possibility that we
! 			 * marked somebody as a conflict and then they go idle.
  			 */
! 			if (DoingCommandRead && IsTransactionBlock() &&
! 				cancelMode == CONFLICT_MODE_ERROR)
! 			{
! 				cancelMode = CONFLICT_MODE_FATAL;
  			}
  
- 			switch (cancelMode)
- 			{
- 				case CONFLICT_MODE_FATAL:
- 					ImmediateInterruptOK = false;	/* not idle anymore */
- 					DisableNotifyInterrupt();
- 					DisableCatchupInterrupt();
- 					Assert(RecoveryInProgress());
- 					ereport(FATAL,
- 							(errcode(ERRCODE_QUERY_CANCELED),
- 							 errmsg("canceling session due to conflict with recovery")));
  
! 				case CONFLICT_MODE_ERROR:
! 					/*
! 					 * We are aborting because we need to release
! 					 * locks. So we need to abort out of all
! 					 * subtransactions to make sure we release
! 					 * all locks at whatever their level.
! 					 *
! 					 * XXX Should we try to examine the
! 					 * transaction tree and cancel just enough
! 					 * subxacts to remove locks? Doubt it.
! 					 */
! 					ImmediateInterruptOK = false;	/* not idle anymore */
! 					DisableNotifyInterrupt();
! 					DisableCatchupInterrupt();
! 					Assert(RecoveryInProgress());
! 					AbortOutOfAnyTransaction();
! 					ereport(ERROR,
! 							(errcode(ERRCODE_QUERY_CANCELED),
! 							 errmsg("canceling statement due to conflict with recovery")));
  
! 				default:
! 					/* No conflict pending, so fall through */
! 					break;
! 			}
  		}
  
  		/*
! 		 * If we are reading a command from the client, just ignore the
! 		 * cancel request --- sending an extra error message won't
! 		 * accomplish anything.  Otherwise, go ahead and throw the error.
  		 */
! 		if (!DoingCommandRead)
  		{
! 			ImmediateInterruptOK = false;	/* not idle anymore */
  			DisableNotifyInterrupt();
  			DisableCatchupInterrupt();
! 			ereport(ERROR,
! 					(errcode(ERRCODE_QUERY_CANCELED),
! 					 errmsg("canceling statement due to user request")));
  		}
  	}
! 	/* If we get here, do nothing (probably, QueryCancelPending was reset) */
  }
  
  
--- 2712,2818 ----
  					(errcode(ERRCODE_ADMIN_SHUTDOWN),
  			 errmsg("terminating connection due to administrator command")));
  	}
+ 
  	if (QueryCancelPending)
  	{
  		QueryCancelPending = false;
+ 		ImmediateInterruptOK = false;	/* not idle anymore */
+ 		DisableNotifyInterrupt();
+ 		DisableCatchupInterrupt();
+ 		/* As in quickdie, don't risk sending to client during auth */
+ 		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
+ 			whereToSendOutput = DestNone;
  		if (ClientAuthInProgress)
  			ereport(ERROR,
  					(errcode(ERRCODE_QUERY_CANCELED),
  					 errmsg("canceling authentication due to timeout")));
! 		else if (cancel_from_timeout)
  			ereport(ERROR,
  					(errcode(ERRCODE_QUERY_CANCELED),
  					 errmsg("canceling statement due to statement timeout")));
! 		else if (IsAutoVacuumWorkerProcess())
  			ereport(ERROR,
  					(errcode(ERRCODE_QUERY_CANCELED),
  					 errmsg("canceling autovacuum task")));
! 		else
! 			ereport(ERROR,
! 					(errcode(ERRCODE_QUERY_CANCELED),
! 					 errmsg("canceling statement due to user request")));
! 	}
! 
  
+ 	{
+ 		LocalTransactionId current_canceled = CanceledLocalTransaction;
+ 
+ 		/*
+ 		 * To avoid cancelling the wrong transaction because the
+ 		 * normal transaction already finished and generated a new
+ 		 * error before the signal handler or ProcessInterrupts has
+ 		 * run we recheck the current LocalTransactionId.
+ 		 */
+ 
+ 		if(CanceledLocalTransaction != InvalidLocalTransactionId)
+ 		{
+ 			ImmediateInterruptOK = false;	/* not idle anymore */
  			/*
! 			 * Cancel the transaction whether or not it was idle, so don't mention
! 			 * the word idle in the message.
  			 */
! 			if (current_canceled == MyProc->lxid){
! 				ereport(ERROR,
! 						(errcode(ERRCODE_QUERY_CANCELED_HS),
! 						 errmsg("canceling transaction due to conflict with recovery")));
  			}
+ 		}
+ 	}
+ }
  
  
! void
! RecoveryConflictInterrupt(recovery_conflict_mode conflict_mode, sig_atomic_t signal_data)
! {
! 	int			save_errno = errno;
  
! 	/*
! 	 * Don't joggle the elbow of proc_exit
! 	 */
! 	if (!proc_exit_inprogress)
! 	{
! 		switch (conflict_mode)
! 		{
! 			case CONFLICT_MODE_FATAL:
! 				ProcDiePending = true;
! 				break;
! 
! 			case CONFLICT_MODE_ERROR:
! 				CanceledLocalTransaction = (LocalTransactionId)signal_data;
! 				break;
! 
! 			default:
! 				elog(ERROR, "Unknown conflict mode");
  		}
  
+ 		InterruptPending = true;
+ 
  		/*
! 		 * If it's safe to interrupt, and we're waiting for input or a lock,
! 		 * service the interrupt immediately. Same as in die()
  		 */
! 		if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
! 			CritSectionCount == 0)
  		{
! 			/* bump holdoff count to make ProcessInterrupts() a no-op */
! 			/* until we are done getting ready for it */
! 			InterruptHoldoffCount++;
! 			LockWaitCancel();	/* prevent CheckDeadLock from running */
  			DisableNotifyInterrupt();
  			DisableCatchupInterrupt();
! 			InterruptHoldoffCount--;
! 			ProcessInterrupts();
  		}
  	}
! 
! 	errno = save_errno;
  }
  
  
*************** PostgresMain(int argc, char *argv[], con
*** 3560,3568 ****
  		debug_query_string = NULL;
  
  		/*
! 		 * Abort the current transaction in order to recover.
  		 */
! 		AbortCurrentTransaction();
  
  		/*
  		 * Now return to normal top-level context and clear ErrorContext for
--- 3559,3593 ----
  		debug_query_string = NULL;
  
  		/*
! 		 * Abort the current transaction in order to recover. If were
! 		 * in HS this is the only safe point where we can abort not
! 		 * only the current transaction but also all transaction above
! 		 * the current as there exists no higher point to jump to and
! 		 * thus were not playing around with somebodys execution context.
! 		 *
! 		 * To avoid cancelling the wrong transaction because the
! 		 * normal transaction already finished and generated a new
! 		 * error before the signal handler has run we recheck the
! 		 * current LocalTransactionId.
! 		 *
! 		 * XXX: Possibly this should use the new error code for a
! 		 * transaction canceled by HS instead.
  		 */
! 		{
! 			LocalTransactionId current_canceled = CanceledLocalTransaction;
! 			if(current_canceled != InvalidLocalTransactionId &&
! 			   current_canceled == MyProc->lxid)
! 				AbortTransactionAndAnySubtransactions();
! 			else
! 				AbortCurrentTransaction();
! 		}
! 
! 		/*
! 		 * We cannot overwrite a newly canceled LocalTransactionId
! 		 * here because we would have to leave that block to start a
! 		 * new transaction.
! 		 */
! 		CanceledLocalTransaction = InvalidLocalTransactionId;
  
  		/*
  		 * Now return to normal top-level context and clear ErrorContext for
*************** PostgresMain(int argc, char *argv[], con
*** 3598,3603 ****
--- 3623,3630 ----
  
  	for (;;)
  	{
+ 		CanceledLocalTransaction = InvalidLocalTransactionId;
+ 
  		/*
  		 * At top of loop, reset extended-query-message flag, so that any
  		 * errors encountered in "idle" state don't provoke skip.
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 4c67be5..47b325a 100644
*** a/src/include/access/xact.h
--- b/src/include/access/xact.h
*************** extern bool IsTransactionBlock(void);
*** 204,209 ****
--- 204,210 ----
  extern bool IsTransactionOrTransactionBlock(void);
  extern char TransactionBlockStatusCode(void);
  extern void AbortOutOfAnyTransaction(void);
+ extern void AbortTransactionAndAnySubtransactions(void);
  extern void PreventTransactionChain(bool isTopLevel, const char *stmtType);
  extern void RequireTransactionChain(bool isTopLevel, const char *stmtType);
  extern bool IsInTransactionChain(bool isTopLevel);
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index de0df3b..46c74ef 100644
*** a/src/include/storage/proc.h
--- b/src/include/storage/proc.h
*************** struct PGPROC
*** 95,107 ****
  
  	uint8		vacuumFlags;	/* vacuum-related flags, see above */
  
- 	/*
- 	 * While in hot standby mode, setting recoveryConflictMode instructs
- 	 * the backend to commit suicide. Possible values are the same as those
- 	 * passed to ResolveRecoveryConflictWithVirtualXIDs().
- 	 */
- 	int			recoveryConflictMode;
- 
  	/* Info about LWLock the process is currently waiting for, if any. */
  	bool		lwWaiting;		/* true if waiting for an LW lock */
  	bool		lwExclusive;	/* true if waiting for exclusive access */
--- 95,100 ----
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 314491d..868058b 100644
*** a/src/include/storage/procarray.h
--- b/src/include/storage/procarray.h
***************
*** 17,22 ****
--- 17,23 ----
  #include "storage/lock.h"
  #include "storage/standby.h"
  #include "utils/snapshot.h"
+ #include "storage/procsignal.h"
  
  
  extern Size ProcArrayShmemSize(void);
*************** extern VirtualTransactionId *GetCurrentV
*** 59,65 ****
  extern VirtualTransactionId *GetConflictingVirtualXIDs(TransactionId limitXmin,
  					Oid dbOid, bool skipExistingConflicts);
  extern pid_t CancelVirtualTransaction(VirtualTransactionId vxid,
! 						 int cancel_mode);
  
  extern int	CountActiveBackends(void);
  extern int	CountDBBackends(Oid databaseid);
--- 60,66 ----
  extern VirtualTransactionId *GetConflictingVirtualXIDs(TransactionId limitXmin,
  					Oid dbOid, bool skipExistingConflicts);
  extern pid_t CancelVirtualTransaction(VirtualTransactionId vxid,
! 									  recovery_conflict_mode cancel_mode);
  
  extern int	CountActiveBackends(void);
  extern int	CountDBBackends(Oid databaseid);
diff --git a/src/include/storage/procsignal.h b/src/include/storage/procsignal.h
index 3b0d56d..795346c 100644
*** a/src/include/storage/procsignal.h
--- b/src/include/storage/procsignal.h
***************
*** 30,35 ****
--- 30,38 ----
   */
  typedef enum
  {
+ 	PROCSIG_CONFLICT_FATAL_INTERRUPT,	/* recovery conflict fatal */
+ 	PROCSIG_CONFLICT_ERROR_INTERRUPT,	/* recovery conflict error */
+ 
  	PROCSIG_CATCHUP_INTERRUPT,	/* sinval catchup interrupt */
  	PROCSIG_NOTIFY_INTERRUPT,	/* listen/notify interrupt */
  
diff --git a/src/include/storage/standby.h b/src/include/storage/standby.h
index 7a1c41f..ac950b7 100644
*** a/src/include/storage/standby.h
--- b/src/include/storage/standby.h
***************
*** 20,31 ****
  extern int	vacuum_defer_cleanup_age;
  
  /* cancel modes for ResolveRecoveryConflictWithVirtualXIDs */
! #define CONFLICT_MODE_NOT_SET		0
! #define CONFLICT_MODE_ERROR			1	/* Conflict can be resolved by canceling query */
! #define CONFLICT_MODE_FATAL			2	/* Conflict can only be resolved by disconnecting session */
  
  extern void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
! 									   char *reason, int cancel_mode);
  
  extern void InitRecoveryTransactionEnvironment(void);
  extern void ShutdownRecoveryTransactionEnvironment(void);
--- 20,33 ----
  extern int	vacuum_defer_cleanup_age;
  
  /* cancel modes for ResolveRecoveryConflictWithVirtualXIDs */
! typedef enum {
! 	CONFLICT_MODE_NOT_SET, /* No conflict */
! 	CONFLICT_MODE_FATAL, /* Conflict can only be resolved by disconnecting session */
! 	CONFLICT_MODE_ERROR /* Conflict can be resolved by canceling query */
! } recovery_conflict_mode;
  
  extern void ResolveRecoveryConflictWithVirtualXIDs(VirtualTransactionId *waitlist,
! 									   char *reason, recovery_conflict_mode cancel_mode);
  
  extern void InitRecoveryTransactionEnvironment(void);
  extern void ShutdownRecoveryTransactionEnvironment(void);
diff --git a/src/include/tcop/tcopprot.h b/src/include/tcop/tcopprot.h
index 1298f7d..b901b6a 100644
*** a/src/include/tcop/tcopprot.h
--- b/src/include/tcop/tcopprot.h
***************
*** 19,27 ****
--- 19,30 ----
  #ifndef TCOPPROT_H
  #define TCOPPROT_H
  
+ #include <signal.h>
  #include "executor/execdesc.h"
  #include "nodes/parsenodes.h"
  #include "utils/guc.h"
+ #include "storage/standby.h"
+ 
  
  
  /* Required daylight between max_stack_depth and the kernel limit, in bytes */
*************** extern bool assign_max_stack_depth(int n
*** 63,68 ****
--- 66,73 ----
  extern void die(SIGNAL_ARGS);
  extern void quickdie(SIGNAL_ARGS);
  extern void StatementCancelHandler(SIGNAL_ARGS);
+ 
+ extern void RecoveryConflictInterrupt(recovery_conflict_mode conflict_mode, sig_atomic_t signal_data);
  extern void FloatExceptionHandler(SIGNAL_ARGS);
  extern void prepare_for_client_read(void);
  extern void client_read_ended(void);
diff --git a/src/include/utils/errcodes.h b/src/include/utils/errcodes.h
index 52c09ca..279f0e4 100644
*** a/src/include/utils/errcodes.h
--- b/src/include/utils/errcodes.h
***************
*** 328,333 ****
--- 328,334 ----
  /* Class 57 - Operator Intervention (class borrowed from DB2) */
  #define ERRCODE_OPERATOR_INTERVENTION		MAKE_SQLSTATE('5','7', '0','0','0')
  #define ERRCODE_QUERY_CANCELED				MAKE_SQLSTATE('5','7', '0','1','4')
+ #define ERRCODE_QUERY_CANCELED_HS			MAKE_SQLSTATE('5','7', '0','1','5')
  #define ERRCODE_ADMIN_SHUTDOWN				MAKE_SQLSTATE('5','7', 'P','0','1')
  #define ERRCODE_CRASH_SHUTDOWN				MAKE_SQLSTATE('5','7', 'P','0','2')
  #define ERRCODE_CANNOT_CONNECT_NOW			MAKE_SQLSTATE('5','7', 'P','0','3')
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index b9ca54f..810ef8c 100644
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
*************** exception_matches_conditions(ErrorData *
*** 897,903 ****
  		 * OTHERS matches everything *except* query-canceled; if you're
  		 * foolish enough, you can match that explicitly.
  		 */
! 		if (sqlerrstate == 0)
  		{
  			if (edata->sqlerrcode != ERRCODE_QUERY_CANCELED)
  				return true;
--- 897,905 ----
  		 * OTHERS matches everything *except* query-canceled; if you're
  		 * foolish enough, you can match that explicitly.
  		 */
! 		if (edata->sqlerrcode == ERRCODE_QUERY_CANCELED_HS)
! 			;
! 		else if (sqlerrstate == 0)
  		{
  			if (edata->sqlerrcode != ERRCODE_QUERY_CANCELED)
  				return true;
#42Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#40)
Re: Hot Standy introduced problem with query cancel behavior

On 01/07/10 22:37, Andres Freund wrote:

On Thursday 07 January 2010 22:28:46 Tom Lane wrote:

Andres Freund<andres@anarazel.de> writes:

I did not want to suggest using Simons code there. Sorry for the brevity.
should have read as "revert to old code and add two step killing (thats
likely around 10 lines or so)".

"two step killing" meaning that we signal ERROR for a few times and if
nothing happens that we like, we signal FATAL.
As the code already loops around signaling anyway that should be easy to
implement.

Ah. This loop happens in the process that's trying to send the cancel
signal, correct, not the one that needs to respond to it? That sounds
fairly sane to me.

Yes.

There are some things we could do to make it more likely that a cancel
of this type is accepted --- for instance, give it a distinct SQLSTATE
code that *can not* be trapped by plpgsql EXCEPTION blocks --- but there
is no practical way to guarantee it except elog(FATAL). I'm not
entirely convinced that an untrappable error would be a good thing
anyway; it's hard to argue that that's much better than a FATAL.

Well a session which is usable after a transaction abort is quite sensible -
quite some software I know handles a failing transaction much more gracefully
than a session abort (e.g. because it has to deal with serialization failures
and such anyway).

So making it cought by fewer places and degrading to FATAL sound sensible and
relatively easy to me.
Unless somebody disagrees I will give it a shot.

Alternatively the current state is available at:
http://git.postgresql.org/gitweb?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/hs-query-cancel

Its a bit raw around the edges, but I wanted to get some feedback...

Andres

#43Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#41)
Re: Hot Standy introduced problem with query cancel behavior

On Tue, 2010-01-12 at 06:30 +0100, Andres Freund wrote:

On 01/07/10 22:37, Andres Freund wrote:

On Thursday 07 January 2010 22:28:46 Tom Lane wrote:

Andres Freund<andres@anarazel.de> writes:

I did not want to suggest using Simons code there. Sorry for the brevity.
should have read as "revert to old code and add two step killing (thats
likely around 10 lines or so)".

"two step killing" meaning that we signal ERROR for a few times and if
nothing happens that we like, we signal FATAL.
As the code already loops around signaling anyway that should be easy to
implement.

Ah. This loop happens in the process that's trying to send the cancel
signal, correct, not the one that needs to respond to it? That sounds
fairly sane to me.

Yes.

There are some things we could do to make it more likely that a cancel
of this type is accepted --- for instance, give it a distinct SQLSTATE
code that *can not* be trapped by plpgsql EXCEPTION blocks --- but there
is no practical way to guarantee it except elog(FATAL). I'm not
entirely convinced that an untrappable error would be a good thing
anyway; it's hard to argue that that's much better than a FATAL.

Well a session which is usable after a transaction abort is quite sensible -
quite some software I know handles a failing transaction much more gracefully
than a session abort (e.g. because it has to deal with serialization failures
and such anyway).

So making it cought by fewer places and degrading to FATAL sound sensible and
relatively easy to me.
Unless somebody disagrees I will give it a shot.

Ok, here is a stab at that:

1. Allow the signal multiplexing facility to transfer one sig_atomic_t
worth of data. This is usefull e.g. for making operations idempotent
or more precise.

In this the LocalBackendId is transported - that should make it
impossible to cancel the wrong transaction

This doesn't remove the possibility of cancelling the wrong transaction,
it just reduces the chances. You can still cancel a session with
LocalBackendId == 1 and then have it accidentally cancel the next
session with the same backendid. That's why I didn't chase this idea
myself earlier.

Closing that loophole isn't a priority and is best left until we have
the other things have been cleaned up.

2.

AbortTransactionAndAnySubtransactions is only used in the mainloops
error handler as it should be unproblematic there.

In the current CVS code ConditionalVirtualXactLockTableWait() in
ResolveRecoveryConflictWithVirtualXIDs does the wait for every try of
cancelling the other transaction.

I moved the retry logic into CancelVirtualTransaction(). If 50 times a
ERROR does nothing it degrades to FATAL

It's a good idea but not the right one, IMHO. I'd rather that the
backend decided whether the instruction results in an ERROR or a FATAL
based upon its own state, rather than have Startup process bang its head
against the wall 50 times without knowing why.

XXX: I temporarily do not use the skipExistingConflicts argument of
GetConflictingVirtualXIDs - I dont understand its purpose and a bit of
infrastructure is missing right now as the recoveryConflictMode is not
stored anymore anywhere. Can easily be added back though.

3.
Add a new error code ERRCODE_QUERY_CANCELED_HS for use with HS
indicating a failure that is more than a plain
ERRCODE_QUERY_CANCELED - namely it should not be caught from
various places like savepoints and in PLs.

Exemplarily I checked for that error code in plpgsql and make it
uncatcheable.

I am not sure how new errorcode codes get chosen though - and the name
is not that nice.

Opinions on that?

I don't trust it yet, as a mechanism. Changing only PL/pgSQL seems
fairly broken also. Again, this seems like something that be handled
separately.

I copied quite a bit from Simons earlier patch...

It changes a few things around and adds the ideas you've mentioned,
though seeing them as code doesn't in itself move the discussion
forwards. There are far too many new ideas in this patch for it to be
even close to acceptable, to me. Yes, lets discuss these additional
ideas, but after a basic patch has been applied.

I do value your interest and input, though racing me to rework my own
patch, then asking me to review it seems like wasted time for both of
us. I will post a new patch on ~Friday.

(Also, please make your braces { } follow the normal coding conventions
for Postgres.)

Currently the patch does not yet do anything to avoid letting the
protocol out of sync. What do you think about adding a flag for error
codes not to communicate with the client (Similarly to COMERROR)?

So that one could do an elog(ERROR & ERROR_NO_SEND_CLIENT, .. or such?

Seems fairly important piece.

--
Simon Riggs www.2ndQuadrant.com

#44Andres Freund
andres@anarazel.de
In reply to: Simon Riggs (#43)
Re: Hot Standy introduced problem with query cancel behavior

On 01/12/10 09:40, Simon Riggs wrote:

On Tue, 2010-01-12 at 06:30 +0100, Andres Freund wrote:

On 01/07/10 22:37, Andres Freund wrote:

On Thursday 07 January 2010 22:28:46 Tom Lane wrote:

Andres Freund<andres@anarazel.de> writes:

I did not want to suggest using Simons code there. Sorry for the brevity.
should have read as "revert to old code and add two step killing (thats
likely around 10 lines or so)".

"two step killing" meaning that we signal ERROR for a few times and if
nothing happens that we like, we signal FATAL.
As the code already loops around signaling anyway that should be easy to
implement.

Ah. This loop happens in the process that's trying to send the cancel
signal, correct, not the one that needs to respond to it? That sounds
fairly sane to me.

Yes.

There are some things we could do to make it more likely that a cancel
of this type is accepted --- for instance, give it a distinct SQLSTATE
code that *can not* be trapped by plpgsql EXCEPTION blocks --- but there
is no practical way to guarantee it except elog(FATAL). I'm not
entirely convinced that an untrappable error would be a good thing
anyway; it's hard to argue that that's much better than a FATAL.

Well a session which is usable after a transaction abort is quite sensible -
quite some software I know handles a failing transaction much more gracefully
than a session abort (e.g. because it has to deal with serialization failures
and such anyway).

So making it cought by fewer places and degrading to FATAL sound sensible and
relatively easy to me.
Unless somebody disagrees I will give it a shot.

Ok, here is a stab at that:

1. Allow the signal multiplexing facility to transfer one sig_atomic_t
worth of data. This is usefull e.g. for making operations idempotent
or more precise.

In this the LocalBackendId is transported - that should make it
impossible to cancel the wrong transaction

This doesn't remove the possibility of cancelling the wrong transaction,
it just reduces the chances. You can still cancel a session with
LocalBackendId == 1 and then have it accidentally cancel the next
session with the same backendid. That's why I didn't chase this idea
myself earlier.

Hm. I don't think so. Backend Initialization clears those flags. And the
signal is sent to a specific pid so you cant send it to a new backend
when targeting the old one.

So how should that occur?

Closing that loophole isn't a priority and is best left until we have
the other things have been cleaned up.

Yes, maybe. It was just rather easy to fix and fixed the problem
sufficiently so that I could not reproduce it in contrast to seeing the
problem during a regular testrun.

2.

AbortTransactionAndAnySubtransactions is only used in the mainloops
error handler as it should be unproblematic there.

In the current CVS code ConditionalVirtualXactLockTableWait() in
ResolveRecoveryConflictWithVirtualXIDs does the wait for every try of
cancelling the other transaction.

I moved the retry logic into CancelVirtualTransaction(). If 50 times a
ERROR does nothing it degrades to FATAL

It's a good idea but not the right one, IMHO. I'd rather that the
backend decided whether the instruction results in an ERROR or a FATAL
based upon its own state, rather than have Startup process bang its head
against the wall 50 times without knowing why.

I don't think its that easy to keep enough state there in a safe manner.
Also the concept of retrying is not new - I just made it degrade and not
rewait for max_standby_delay (which imho is a bug).

In many cases the retries and repeated ERRORs will be enough to free the
backend out of all PG_TRY/CATCH blocks that the next ERROR reaches the
mainloop. So the retrying is quite sensible - and you cant do that part
in the signalled backend itself.

XXX: I temporarily do not use the skipExistingConflicts argument of
GetConflictingVirtualXIDs - I dont understand its purpose and a bit of
infrastructure is missing right now as the recoveryConflictMode is not
stored anymore anywhere. Can easily be added back though.

Is that a leftover from additional capabilities (deferred conflict
handling?)? Because currently there will be never a case with two
different cancellations at the same time. Also the current logic throws
away a FATAL error if a ERROR is already there.

3.
Add a new error code ERRCODE_QUERY_CANCELED_HS for use with HS
indicating a failure that is more than a plain
ERRCODE_QUERY_CANCELED - namely it should not be caught from
various places like savepoints and in PLs.

Exemplarily I checked for that error code in plpgsql and make it
uncatcheable.

I am not sure how new errorcode codes get chosen though - and the name
is not that nice.

Opinions on that?

I don't trust it yet, as a mechanism. Changing only PL/pgSQL seems
fairly broken also. Again, this seems like something that be handled
separately.

Well, that was an exemplary change - its easy to fix that at other
places as well. Without any identifier of such an abort I don't see how
that could work.

We simply cant break out of nested transactions if were outside the
mainloop.

I copied quite a bit from Simons earlier patch...

It changes a few things around and adds the ideas you've mentioned,
though seeing them as code doesn't in itself move the discussion
forwards. There are far too many new ideas in this patch for it to be
even close to acceptable, to me. Yes, lets discuss these additional
ideas, but after a basic patch has been applied.

Sure, the "targeted" signalling part can be left of, yes. But the rest
is mostly necessary I think.

I do value your interest and input, though racing me to rework my own
patch, then asking me to review it seems like wasted time for both of
us. I will post a new patch on ~Friday.

Well. I explicitly asked whether you or somebody else is going after
this. Waited two days and only then started working on it.
And you seem to have enough on your plate...

(Also, please make your braces { } follow the normal coding conventions
for Postgres.)

Sure.

Currently the patch does not yet do anything to avoid letting the
protocol out of sync. What do you think about adding a flag for error
codes not to communicate with the client (Similarly to COMERROR)?

So that one could do an elog(ERROR& ERROR_NO_SEND_CLIENT, .. or such?

Seems fairly important piece.

Its quite a bit of manual work though so I wanted to be sure thats an
agreeable approach.

Andres

#45Andres Freund
andres@anarazel.de
In reply to: Simon Riggs (#43)
Re: Hot Standy introduced problem with query cancel behavior

On Tuesday 12 January 2010 09:40:03 Simon Riggs wrote:

On Tue, 2010-01-12 at 06:30 +0100, Andres Freund wrote:

Currently the patch does not yet do anything to avoid letting the
protocol out of sync. What do you think about adding a flag for error
codes not to communicate with the client (Similarly to COMERROR)?

So that one could do an elog(ERROR & ERROR_NO_SEND_CLIENT, .. or such?

Seems fairly important piece.

Do you aggree on the approach then? Do you want to do it?

Andres

#46Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#45)
Re: Hot Standy introduced problem with query cancel behavior

Andres Freund <andres@anarazel.de> writes:

On Tuesday 12 January 2010 09:40:03 Simon Riggs wrote:

So that one could do an elog(ERROR & ERROR_NO_SEND_CLIENT, .. or such?

Do you aggree on the approach then? Do you want to do it?

Why not use the already-existing COMMERROR thing?

regards, tom lane

#47Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#46)
Re: Hot Standy introduced problem with query cancel behavior

Hi,

The thought was that it might be helpful to be able to raise NOTICE or DEBUG* as well - but admitedly there is not a big need for it...

Andres
--
Sent from a mobile phone - please excuse brevity and formatting.
----- Ursprüngliche Mitteilung -----

Show quoted text

Andres Freund <andres@anarazel.de> writes:

On Tuesday 12 January 2010 09:40:03 Simon Riggs wrote:

So that one could do an elog(ERROR & ERROR_NO_SEND_CLIENT, .. or such?

Do you aggree on the approach then? Do you want to do it?

Why not use the already-existing COMMERROR thing?

            regards, tom lane

#48Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#45)
Re: Hot Standy introduced problem with query cancel behavior

On Tue, 2010-01-12 at 19:43 +0100, Andres Freund wrote:

On Tuesday 12 January 2010 09:40:03 Simon Riggs wrote:

On Tue, 2010-01-12 at 06:30 +0100, Andres Freund wrote:

Currently the patch does not yet do anything to avoid letting the
protocol out of sync. What do you think about adding a flag for error
codes not to communicate with the client (Similarly to COMERROR)?

So that one could do an elog(ERROR & ERROR_NO_SEND_CLIENT, .. or such?

Seems fairly important piece.

Do you aggree on the approach then? Do you want to do it?

If you would like to prototype something on this issue it would be
gratefully received. I will review when submitted, though I may need
other review also.

I'm still reworking other code, so things might change under you, though
not deliberately so. I will post as soon as I can, which isn't yet.

--
Simon Riggs www.2ndQuadrant.com

#49Andres Freund
andres@anarazel.de
In reply to: Simon Riggs (#48)
Re: Hot Standy introduced problem with query cancel behavior

On Wednesday 13 January 2010 00:07:53 Simon Riggs wrote:

On Tue, 2010-01-12 at 19:43 +0100, Andres Freund wrote:

On Tuesday 12 January 2010 09:40:03 Simon Riggs wrote:

On Tue, 2010-01-12 at 06:30 +0100, Andres Freund wrote:

Currently the patch does not yet do anything to avoid letting the
protocol out of sync. What do you think about adding a flag for error
codes not to communicate with the client (Similarly to COMERROR)?

So that one could do an elog(ERROR & ERROR_NO_SEND_CLIENT, .. or
such?

Seems fairly important piece.

Do you aggree on the approach then? Do you want to do it?

If you would like to prototype something on this issue it would be
gratefully received. I will review when submitted, though I may need
other review also.

Will do - likely not before Saturday though.

I'm still reworking other code, so things might change under you, though
not deliberately so. I will post as soon as I can, which isn't yet.

No problem. Readapting a relatively minor amount of code isnt that hard.

Andres