Bug in walsender when calling out to do_pg_stop_backup (and others?)

Started by Magnus Hagander over 14 years ago. 19 messages.
#1Magnus Hagander
magnus@hagander.net
1 attachment(s)

When walsender calls out to do_pg_stop_backup() (during base backups),
it is not possible to terminate the process with a SIGTERM - it
requires a SIGKILL. This can leave unkillable backends for example if
archive_mode is on and archive_command is failing (or not set). A
similar thing would happen in other cases if walsender calls out to
something that would block (do_pg_start_backup() for example), but the
stop one is easy to provoke.

ISTM one way to fix it is the attached, which is to have walsender set
the "global" flags saying that we have received sigterm, which in turn
causes the CHECK_FOR_INTERRUPTS() calls in the routines to properly
exit the process. AFAICT it works fine. Any holes in this approach?

Second, I wonder if we should add a SIGINT handler as well, that would
make it possible to send a cancel signal. Given that the end result
would be the same (at least if we want to keep with the "walsender is
simple" path), I'm not sure it's necessary - but it would at least
help those doing pg_cancel_backend()... thoughts?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Attachments:

walsender_sigterm.patch (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 5d1c518..c8fd165 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1302,6 +1302,13 @@ WalSndShutdownHandler(SIGNAL_ARGS)
 	if (MyWalSnd)
 		SetLatch(&MyWalSnd->latch);
 
+	/*
+	 * Set the standard (non-walsender) state as well, so that we can
+	 * abort things like do_pg_stop_backup().
+	 */
+	InterruptPending = true;
+	ProcDiePending = true;
+
 	errno = save_errno;
 }
 
#2Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#1)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On Wed, Oct 5, 2011 at 10:30 PM, Magnus Hagander <magnus@hagander.net> wrote:

When walsender calls out to do_pg_stop_backup() (during base backups),
it is not possible to terminate the process with a SIGTERM - it
requires a SIGKILL. This can leave unkillable backends for example if
archive_mode is on and archive_command is failing (or not set). A
similar thing would happen in other cases if walsender calls out to
something that would block (do_pg_start_backup() for example), but the
stop one is easy to provoke.

Good catch!

ISTM one way to fix it is the attached, which is to have walsender set
the "global" flags saying that we have received sigterm, which in turn
causes the CHECK_FOR_INTERRUPTS() calls in the routines to properly
exit the process. AFAICT it works fine. Any holes in this approach?

Very simple patch. Looks fine.

Second, I wonder if we should add a SIGINT handler as well, that would
make it possible to send a cancel signal. Given that the end result
would be the same (at least if we want to keep with the "walsender is
simple" path), I'm not sure it's necessary - but it would at least
help those doing pg_cancel_backend()... thoughts?

I don't think that's necessary because, as you suggested, there is no
use case for it right *now*. We can add that handler when someone
proposes a feature which requires it.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#3Florian Pflug
fgp@phlo.org
In reply to: Magnus Hagander (#1)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On Oct5, 2011, at 15:30 , Magnus Hagander wrote:

When walsender calls out to do_pg_stop_backup() (during base backups),
it is not possible to terminate the process with a SIGTERM - it
requires a SIGKILL. This can leave unkillable backends for example if
archive_mode is on and archive_command is failing (or not set). A
similar thing would happen in other cases if walsender calls out to
something that would block (do_pg_start_backup() for example), but the
stop one is easy to provoke.

Hm, this seems to be related to another buglet I noticed a while ago,
but then forgot about again. If one terminates pg_basebackup while it's
waiting for all required WAL to be archived, the backend process only
exits once that waiting phase is over. If, like in your failure case,
archive_command fails indefinitely (or isn't set), the backend process
stays around forever.

Your patch would improve that only insofar as it'd at least allow an
immediate shutdown request to succeed - as it stands, that doesn't work
because, as you mentioned, the blocked walsender doesn't handle SIGTERM.

The question is, should we do more? To me, it'd make sense to terminate
a backend once its connection is gone. We could, for example, make
pq_flush() set a global flag, and make CHECK_FOR_INTERRUPTS handle a
broken connection the same way as a SIGINT or SIGTERM.

Thoughts?

ISTM one way to fix it is the attached, which is to have walsender set
the "global" flags saying that we have received sigterm, which in turn
causes the CHECK_FOR_INTERRUPTS() calls in the routines to properly
exit the process. AFAICT it works fine. Any holes in this approach?

Seems sensible, but my knowledge about these code paths is quite limited..

Second, I wonder if we should add a SIGINT handler as well, that would
make it possible to send a cancel signal. Given that the end result
would be the same (at least if we want to keep with the "walsender is
simple" path), I'm not sure it's necessary - but it would at least
help those doing pg_cancel_backend()... thoughts?

If all that's needed is a simple SIGINT handler that sets one flag, I'd
say let's add it.

best regards,
Florian Pflug

#4Magnus Hagander
magnus@hagander.net
In reply to: Florian Pflug (#3)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On Thu, Oct 6, 2011 at 14:34, Florian Pflug <fgp@phlo.org> wrote:

On Oct5, 2011, at 15:30 , Magnus Hagander wrote:

When walsender calls out to do_pg_stop_backup() (during base backups),
it is not possible to terminate the process with a SIGTERM - it
requires a SIGKILL. This can leave unkillable backends for example if
archive_mode is on and archive_command is failing (or not set). A
similar thing would happen in other cases if walsender calls out to
something that would block (do_pg_start_backup() for example), but the
stop one is easy to provoke.

Hm, this seems to be related to another buglet I noticed a while ago,
but then forgot about again. If one terminates pg_basebackup while it's
waiting for all required WAL to be archived, the backend process only
exits once that waiting phase is over. If, like in your failure case,
archive_command fails indefinitely (or isn't set), the backend process
stays around forever.

Yes.

Your patch would improve that only insofar as it'd at least allow an
immediate shutdown request to succeed - as it stands, that doesn't work
because, as you mentioned, the blocked walsender doesn't handle SIGTERM.

Exactly.

The question is, should we do more? To me, it'd make sense to terminate
a backend once its connection is gone. We could, for example, make
pq_flush() set a global flag, and make CHECK_FOR_INTERRUPTS handle a
broken connection the same way as a SIGINT or SIGTERM.

The problem here is that we're hanging at a place where we don't touch
the socket. So we won't notice the socket is gone. We'd have to do a
select() or something like that at regular intervals to make sure it's
there, no?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#5Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#2)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On Thu, Oct 6, 2011 at 04:22, Fujii Masao <masao.fujii@gmail.com> wrote:

On Wed, Oct 5, 2011 at 10:30 PM, Magnus Hagander <magnus@hagander.net> wrote:

When walsender calls out to do_pg_stop_backup() (during base backups),
it is not possible to terminate the process with a SIGTERM - it
requires a SIGKILL. This can leave unkillable backends for example if
archive_mode is on and archive_command is failing (or not set). A
similar thing would happen in other cases if walsender calls out to
something that would block (do_pg_start_backup() for example), but the
stop one is easy to provoke.

Good catch!

ISTM one way to fix it is the attached, which is to have walsender set
the "global" flags saying that we have received sigterm, which in turn
causes the CHECK_FOR_INTERRUPTS() calls in the routines to properly
exit the process. AFAICT it works fine. Any holes in this approach?

Very simple patch. Looks fine.

Ok, thanks. Applied.

Second, I wonder if we should add a SIGINT handler as well, that would
make it possible to send a cancel signal. Given that the end result
would be the same (at least if we want to keep with the "walsender is
simple" path), I'm not sure it's necessary - but it would at least
help those doing pg_cancel_backend()... thoughts?

I don't think that's necessary because, as you suggested, there is no
use case for it right *now*. We can add that handler when someone
proposes a feature which requires it.

Yeah. It's certainly not something backpatch-worthy.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#6Florian Pflug
fgp@phlo.org
In reply to: Magnus Hagander (#4)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On Oct6, 2011, at 21:48 , Magnus Hagander wrote:

The question is, should we do more? To me, it'd make sense to terminate
a backend once its connection is gone. We could, for example, make
pq_flush() set a global flag, and make CHECK_FOR_INTERRUPTS handle a
broken connection the same way as a SIGINT or SIGTERM.

The problem here is that we're hanging at a place where we don't touch
the socket. So we won't notice the socket is gone. We'd have to do a
select() or something like that at regular intervals to make sure it's
there, no?

We do emit NOTICEs saying "pg_stop_backup still waiting ... " repeatedly,
so we should notice a dead connection sooner or later. When I tried, I even
got a message in the log complaining about the "broken pipe".

As it stands, the interval between two NOTICEs grows exponentially - we
send the first after waiting 5 seconds, the next after waiting 60 seconds,
and then after waiting 120, 240, 480, ... seconds. This means that the
backend would in the worst case linger the same amount of time *after*
pg_basebackup was cancelled that pg_basebackup waited for *before* it was
cancelled.
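
Spelled out, the schedule grows like this (a sketch reconstructed from the description above, not the actual backend wait loop):

```c
#include <stddef.h>

/* Successive waits between "pg_stop_backup still waiting" notices, as
 * described above: 5s before the first, 60s before the second, doubling
 * thereafter.  Fills `out` with the intervals and returns the total
 * elapsed seconds until the n-th notice. */
long notice_intervals(long *out, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (i == 0)
            out[i] = 5;
        else if (i == 1)
            out[i] = 60;
        else
            out[i] = out[i - 1] * 2;
        total += out[i];
    }
    return total;
}
```

So by the time the fifth notice would fire, the backend has already been waiting over 15 minutes, which gives a feel for how long it could linger afterwards.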

It'd be nice to generally terminate a backend if the client vanishes, but so
far I haven't had any bright ideas. Using FASYNC and F_SETOWN unfortunately
sends a signal *every time* the fd becomes readable or writeable, not only on
EOF. Doing select() in CHECK_FOR_INTERRUPTS seems far too expensive. We could
make the postmaster keep the fds around even after forking a backend, and
make it watch for broken connections using select(). But with a large
max_backends setting, we'd risk running out of fds in the postmaster...
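
For what it's worth, a cheap way to probe for a vanished peer without blocking would be a non-blocking peek at the socket. This is just a sketch of the idea; connection_lost() is a made-up helper, not anything in the tree:

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Peek at the socket without blocking: recv() returning 0 means the peer
 * performed an orderly close; EAGAIN/EWOULDBLOCK just means no data is
 * pending and the connection is, as far as we can tell, still alive. */
bool connection_lost(int sock)
{
    char    c;
    ssize_t n = recv(sock, &c, 1, MSG_PEEK | MSG_DONTWAIT);

    if (n == 0)
        return true;        /* EOF: client has gone away */
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR))
        return false;       /* idle but alive */
    return n < 0;           /* data pending -> alive; other errors -> lost */
}
```

Even something like this would have to be rate-limited, of course; doing it in every CHECK_FOR_INTERRUPTS is exactly the cost problem I mentioned.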

best regards,
Florian Pflug

#7Magnus Hagander
magnus@hagander.net
In reply to: Florian Pflug (#6)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On Thu, Oct 6, 2011 at 23:46, Florian Pflug <fgp@phlo.org> wrote:

On Oct6, 2011, at 21:48 , Magnus Hagander wrote:

The question is, should we do more? To me, it'd make sense to terminate
a backend once its connection is gone. We could, for example, make
pq_flush() set a global flag, and make CHECK_FOR_INTERRUPTS handle a
broken connection the same way as a SIGINT or SIGTERM.

The problem here is that we're hanging at a place where we don't touch
the socket. So we won't notice the socket is gone. We'd have to do a
select() or something like that at regular intervals to make sure it's
there, no?

We do emit NOTICEs saying "pg_stop_backup still waiting ... " repeatedly,
so we should notice a dead connection sooner or later. When I tried, I even
got a message in the log complaining about the "broken pipe".

Ah, good point, that should be doable. Forgot about that message...

As it stands, the interval between two NOTICEs grows exponentially - we
send the first after waiting 5 seconds, the next after waiting 60 seconds,
and then after waiting 120, 240, 480, ... seconds. This means that the
backend would in the worst case linger the same amount of time *after*
pg_basebackup was cancelled that pg_basebackup waited for *before* it was
cancelled.

It'd be nice to generally terminate a backend if the client vanishes, but so
far I haven't had any bright ideas. Using FASYNC and F_SETOWN unfortunately
sends a signal *every time* the fd becomes readable or writeable, not only on
EOF. Doing select() in CHECK_FOR_INTERRUPTS seems far too expensive. We could
make the postmaster keep the fds around even after forking a backend, and
make it watch for broken connections using select(). But with a large
max_backends setting, we'd risk running out of fds in the postmaster...

Ugh. Yeah. But at least catching it and terminating it when we *do*
notice it's down would certainly make sense...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#8Florian Pflug
fgp@phlo.org
In reply to: Magnus Hagander (#7)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On Oct10, 2011, at 21:25 , Magnus Hagander wrote:

On Thu, Oct 6, 2011 at 23:46, Florian Pflug <fgp@phlo.org> wrote:

It'd be nice to generally terminate a backend if the client vanishes, but so
far I haven't had any bright ideas. Using FASYNC and F_SETOWN unfortunately
sends a signal *every time* the fd becomes readable or writeable, not only on
EOF. Doing select() in CHECK_FOR_INTERRUPTS seems far too expensive. We could
make the postmaster keep the fds around even after forking a backend, and
make it watch for broken connections using select(). But with a large
max_backends setting, we'd risk running out of fds in the postmaster...

Ugh. Yeah. But at least catching it and terminating it when we *do*
notice it's down would certainly make sense...

I'll try to put together a patch that sets a flag if we discover a broken
connection in pq_flush, and tests that flag in CHECK_FOR_INTERRUPTS. Unless you
wanna, of course.

best regards,
Florian Pflug

#9Magnus Hagander
magnus@hagander.net
In reply to: Florian Pflug (#8)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On Tue, Oct 11, 2011 at 03:29, Florian Pflug <fgp@phlo.org> wrote:

On Oct10, 2011, at 21:25 , Magnus Hagander wrote:

On Thu, Oct 6, 2011 at 23:46, Florian Pflug <fgp@phlo.org> wrote:

It'd be nice to generally terminate a backend if the client vanishes, but so
far I haven't had any bright ideas. Using FASYNC and F_SETOWN unfortunately
sends a signal *every time* the fd becomes readable or writeable, not only on
EOF. Doing select() in CHECK_FOR_INTERRUPTS seems far too expensive. We could
make the postmaster keep the fds around even after forking a backend, and
make it watch for broken connections using select(). But with a large
max_backends setting, we'd risk running out of fds in the postmaster...

Ugh. Yeah. But at least catching it and terminating it when we *do*
notice it's down would certainly make sense...

I'll try to put together a patch that sets a flag if we discover a broken
connection in pq_flush, and tests that flag in CHECK_FOR_INTERRUPTS. Unless you
wanna, of course.

Please do, I won't have time to even think about it until after
pgconf.eu anyway ;)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#10Florian Pflug
fgp@phlo.org
In reply to: Magnus Hagander (#9)
1 attachment(s)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On Oct11, 2011, at 09:21 , Magnus Hagander wrote:

On Tue, Oct 11, 2011 at 03:29, Florian Pflug <fgp@phlo.org> wrote:

On Oct10, 2011, at 21:25 , Magnus Hagander wrote:

On Thu, Oct 6, 2011 at 23:46, Florian Pflug <fgp@phlo.org> wrote:

It'd be nice to generally terminate a backend if the client vanishes, but so
far I haven't had any bright ideas. Using FASYNC and F_SETOWN unfortunately
sends a signal *every time* the fd becomes readable or writeable, not only on
EOF. Doing select() in CHECK_FOR_INTERRUPTS seems far too expensive. We could
make the postmaster keep the fds around even after forking a backend, and
make it watch for broken connections using select(). But with a large
max_backends setting, we'd risk running out of fds in the postmaster...

Ugh. Yeah. But at least catching it and terminating it when we *do*
notice it's down would certainly make sense...

I'll try to put together a patch that sets a flag if we discover a broken
connection in pq_flush, and tests that flag in CHECK_FOR_INTERRUPTS. Unless you
wanna, of course.

Please do, I won't have time to even think about it until after
pgconf.eu anyway ;)

Ok, here's a first cut.

I've based this on how query cancellation due to recovery conflicts works -
internal_flush() sets QueryCancelPending and ClientConnectionLostPending.

If QueryCancelPending is set, CHECK_FOR_INTERRUPTS checks
ClientConnectionLostPending, and if it's set it does ereport(FATAL).

I've only done light testing so far - basically the only case I've tested is
killing pg_basebackup while it's waiting for all required WAL to be archived.

best regards,
Florian Pflug

Attachments:

pg.discon_cancel.v1.patch (application/octet-stream; x-unix-mode=0644)
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index b83a2ef..2b114e0 100644
*** a/src/backend/libpq/pqcomm.c
--- b/src/backend/libpq/pqcomm.c
*************** internal_flush(void)
*** 1247,1255 ****
  
  			/*
  			 * We drop the buffered data anyway so that processing can
! 			 * continue, even though we'll probably quit soon.
  			 */
  			PqSendStart = PqSendPointer = 0;
  			return EOF;
  		}
  
--- 1247,1260 ----
  
  			/*
  			 * We drop the buffered data anyway so that processing can
! 			 * continue, even though we'll probably quit soon. We also
! 			 * set a flag that'll cause the next CHECK_FOR_INTERRUPTS
! 			 * to cancel the current query.
  			 */
  			PqSendStart = PqSendPointer = 0;
+ 			InterruptPending = 1;
+ 			QueryCancelPending = 1;
+ 			ClientConnectionLostPending = 1;
  			return EOF;
  		}
  
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 976a832..5e53c24 100644
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
*************** ProcessInterrupts(void)
*** 2876,2881 ****
--- 2876,2891 ----
  				 errmsg("canceling statement due to conflict with recovery"),
  						 errdetail_recovery_conflict()));
  		}
+ 		if (ClientConnectionLostPending)
+ 		{
+ 			ImmediateInterruptOK = false;		/* not idle anymore */
+ 			ClientConnectionLostPending = false;
+ 			DisableNotifyInterrupt();
+ 			DisableCatchupInterrupt();
+ 			ereport(FATAL,
+ 					(errcode(ERRCODE_QUERY_CANCELED),
+ 					 errmsg("canceling statement due to client disconnect")));
+ 		}
  
  		/*
  		 * If we are reading a command from the client, just ignore the cancel
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index c4c4154..a8201ed 100644
*** a/src/backend/utils/init/globals.c
--- b/src/backend/utils/init/globals.c
*************** ProtocolVersion FrontendProtocol;
*** 29,34 ****
--- 29,35 ----
  volatile bool InterruptPending = false;
  volatile bool QueryCancelPending = false;
  volatile bool ProcDiePending = false;
+ bool ClientConnectionLostPending = false;
  volatile bool ImmediateInterruptOK = false;
  volatile uint32 InterruptHoldoffCount = 0;
  volatile uint32 CritSectionCount = 0;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 9d19417..78ebbe3 100644
*** a/src/include/miscadmin.h
--- b/src/include/miscadmin.h
*************** extern PGDLLIMPORT volatile bool Interru
*** 70,75 ****
--- 70,77 ----
  extern volatile bool QueryCancelPending;
  extern volatile bool ProcDiePending;
  
+ extern bool ClientConnectionLostPending;
+ 
  /* these are marked volatile because they are examined by signal handlers: */
  extern volatile bool ImmediateInterruptOK;
  extern PGDLLIMPORT volatile uint32 InterruptHoldoffCount;
#11Greg Jaskiewicz
gj@pointblue.com.pl
In reply to: Florian Pflug (#10)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On 15 Oct 2011, at 11:31, Florian Pflug wrote:

Ok, here's a first cut.

So I looked at the patch, and the first thing that pops out is the lack
of the volatile keyword where the ClientConnectionLostPending variable is defined. Is that done on purpose?

Otherwise the patch itself looks ok.
I haven't tested the code, just reviewed the patch itself. It obviously needs testing, which should be easy by following your original problem description.

Btw, I just tried to do it through commitfest.postgresql.org, but before I got my head around how to add myself to the reviewer list there, I thought I'd just send this response here.

#12Florian Pflug
fgp@phlo.org
In reply to: Greg Jaskiewicz (#11)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On Oct19, 2011, at 17:47 , Greg Jaskiewicz wrote:

On 15 Oct 2011, at 11:31, Florian Pflug wrote:

Ok, here's a first cut.

So I looked at the patch, and the first thing that pops out is the lack
of the volatile keyword where the ClientConnectionLostPending variable is defined. Is that done on purpose?

That's on purpose. volatile is only necessary for variables which are either accessed from within signal handlers or which live in shared memory. Neither is true for ClientConnectionLostPending, so non-volatile should be fine.

Otherwise the patch itself looks ok.
I haven't tested the code, just reviewed the patch itself. It obviously needs testing, which should be easy by following your original problem description.

Yeah, further testing is on my todo list. The interesting case is probably what happens if the connection is dropped while there's already a cancellation request pending. And also the other way around - a cancellation request arriving after we've already discovered that the connection is gone.

Btw, I just tried to do it through commitfest.postgresql.org, but before I got my head around how to add myself to the reviewer list there, I thought I'd just send this response here.

You just need to click "Edit Patch" and put your name into the Reviewer field. You do need a postgres community account to edit patches, but the signup process is quite quick and painless AFAIR.

best regards,
Florian Pflug

#13Greg Jaskiewicz
gj@pointblue.com.pl
In reply to: Florian Pflug (#12)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On 19 Oct 2011, at 17:54, Florian Pflug wrote:

On Oct19, 2011, at 17:47 , Greg Jaskiewicz wrote:

On 15 Oct 2011, at 11:31, Florian Pflug wrote:

Ok, here's a first cut.

So I looked at the patch, and the first thing that pops out is the lack
of the volatile keyword where the ClientConnectionLostPending variable is defined. Is that done on purpose?

That's on purpose. volatile is only necessary for variables which are either accessed from within signal handlers or which live in shared memory. Neither is true for ClientConnectionLostPending, so non-volatile should be fine.

Ok, cool.
I'm aware of the reasons behind volatile; I just noticed that some other flags used in a similar way are marked as such. At the end of the day, this is just a hint to the compiler anyway.

Otherwise the patch itself looks ok.
I haven't tested the code, just reviewed the patch itself. It obviously needs testing, which should be easy by following your original problem description.

Yeah, further testing is on my todo list. The interesting case is probably what happens if the connection is dropped while there's already a cancellation request pending. And also the other way around - a cancellation request arriving after we've already discovered that the connection is gone.

Btw, I just tried to do it through commitfest.postgresql.org, but before I got my head around how to add myself to the reviewer list there, I thought I'd just send this response here.

You just need to click "Edit Patch" and put your name into the Reviewer field. You do need a postgres community account to edit patches, but the signup process is quite quick and painless AFAIR.

Ok, clicking 'Edit Patch' sounds a bit heavyweight. It would probably be better to just be able to click some sort of "I'm in" button/checkbox.

#14Florian Pflug
fgp@phlo.org
In reply to: Greg Jaskiewicz (#13)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On Oct19, 2011, at 18:05 , Greg Jaskiewicz wrote:

On 19 Oct 2011, at 17:54, Florian Pflug wrote:

On Oct19, 2011, at 17:47 , Greg Jaskiewicz wrote:

On 15 Oct 2011, at 11:31, Florian Pflug wrote:

Ok, here's a first cut.

So I looked at the patch, and the first thing that pops out is the lack
of the volatile keyword where the ClientConnectionLostPending variable is defined. Is that done on purpose?

That's on purpose. volatile is only necessary for variables which are either accessed from within signal handlers or which live in shared memory. Neither is true for ClientConnectionLostPending, so non-volatile should be fine.

Ok, cool.
I'm aware of the reasons behind volatile, just noticed that some other flags used in similar way are marked as such. At the end of the day, this is just a hint to the compiler anyway.

All the other flags which indicate cancellation reasons are set from signal handlers, I believe. We could of course mark ClientConnectionLostPending as volatile just to be consistent. Not sure whether that's a good idea or not. It might prevent a mistake should we ever add code to detect lost connections asynchronously (i.e., from somewhere other than pq_flush). And the cost is probably negligible, because CHECK_FOR_INTERRUPTS tests for InterruptPending before calling ProcessInterrupts(), so we only pay the cost of volatile if there's actually an interrupt pending. But I still think it's better to add qualifiers such as volatile only when really necessary. A comment about why it *isn't* volatile is probably in order, though, so I'll add that in the next version of the patch.
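
To illustrate the fast-path point (a simplified model with stand-in names, not the actual miscadmin.h macro):

```c
#include <stdbool.h>

/* Stand-ins for the real flags; connection_lost_flag is deliberately
 * non-volatile, mirroring the patch under discussion. */
volatile bool interrupt_flag = false;
bool connection_lost_flag = false;
int  process_calls = 0;

/* Stand-in for ProcessInterrupts(): only reached when interrupt_flag is
 * set, so the non-volatile flag is read on the slow path only. */
void process_interrupts_demo(void)
{
    process_calls++;
    interrupt_flag = false;
    if (connection_lost_flag)
        connection_lost_flag = false;   /* would ereport(FATAL) in PG */
}

/* The cheap test-and-branch that runs at every check point: the
 * expensive call only happens when the volatile flag is already set. */
#define CHECK_FOR_INTERRUPTS_DEMO() \
    do { if (interrupt_flag) process_interrupts_demo(); } while (0)
```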

best regards,
Florian Pflug

PS: Thanks for the review. It's very much appreciated!

#15Greg Jaskiewicz
gj@pointblue.com.pl
In reply to: Florian Pflug (#14)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On 19 Oct 2011, at 18:28, Florian Pflug wrote:

All the other flags which indicate cancellation reasons are set from signal handlers, I believe. We could of course mark ClientConnectionLostPending as volatile just to be consistent. Not sure whether that's a good idea or not. It might prevent a mistake should we ever add code to detect lost connections asynchronously (i.e., from somewhere other than pq_flush). And the cost is probably negligible, because CHECK_FOR_INTERRUPTS tests for InterruptPending before calling ProcessInterrupts(), so we only pay the cost of volatile if there's actually an interrupt pending. But I still think it's better to add qualifiers such as volatile only when really necessary. A comment about why it *isn't* volatile is probably in order, though, so I'll add that in the next version of the patch.

Makes sense.

I had to ask, because it sticks out. And indeed there is a possibility that someone will one day try to use it from a signal handler, etc.

best regards,
Florian Pflug

PS: Thanks for the review. It's very much appreciated!

No problem. I got inspired by a talk I attended here at pgconf.eu. It seems like a good idea to do these things; I have years of experience as a developer and am doing (relatively) well thanks to PostgreSQL, so it makes sense to contribute something back.

#16Bruce Momjian
bruce@momjian.us
In reply to: Greg Jaskiewicz (#15)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

Greg Jaskiewicz wrote:

On 19 Oct 2011, at 18:28, Florian Pflug wrote:

All the other flags which indicate cancellation reasons are set from signal handlers, I believe. We could of course mark ClientConnectionLostPending as volatile just to be consistent. Not sure whether that's a good idea or not. It might prevent a mistake should we ever add code to detect lost connections asynchronously (i.e., from somewhere other than pq_flush). And the cost is probably negligible, because CHECK_FOR_INTERRUPTS tests for InterruptPending before calling ProcessInterrupts(), so we only pay the cost of volatile if there's actually an interrupt pending. But I still think it's better to add qualifiers such as volatile only when really necessary. A comment about why it *isn't* volatile is probably in order, though, so I'll add that in the next version of the patch.

Makes sense.

I had to ask, because it sticks out. And indeed there is a possibility
that someone will one day try to use it from a signal handler, etc.

A C comment might help there.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#17Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Greg Jaskiewicz (#15)
1 attachment(s)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On 19.10.2011 19:41, Greg Jaskiewicz wrote:

On 19 Oct 2011, at 18:28, Florian Pflug wrote:

All the other flags which indicate cancellation reasons are set from signal handlers, I believe. We could of course mark ClientConnectionLostPending as volatile just to be consistent. Not sure whether that's a good idea or not. It might prevent a mistake should we ever add code to detect lost connections asynchronously (i.e., from somewhere other than pq_flush). And the cost is probably negligible, because CHECK_FOR_INTERRUPTS tests for InterruptPending before calling ProcessInterrupts(), so we only pay the cost of volatile if there's actually an interrupt pending. But I still think it's better to add qualifiers such as volatile only when really necessary. A comment about why it *isn't* volatile is probably in order, though, so I'll add that in the next version of the patch.

Makes sense.

I had to ask, because it sticks out. And indeed there is a possibility that someone will one day try to use it from a signal handler, etc.

Let's just mark it as volatile. It's not clear to me that we'll never
set or read the flag while in a signal handler. We probably don't, but
what if ImmediateInterruptOK is 'true', for example, just when we fail
to send a response and try to set the flags. In that case, we have to be
careful that ClientConnectionLost is set before InterruptPending (which
we can only guarantee if both are volatile, I believe), otherwise a
signal might arrive when we've just set InterruptPending, but not yet
ClientConnectionLost. ProcessInterrupts() would clear InterruptPending,
but not see ClientConnectionLost, so we would lose the interrupt.

That's not serious, and the window for that happening would be extremely
small, and I don't think it can actually happen as the code stands,
because we never try to send anything while ImmediateInterruptOK ==
true. But better safe than sorry.
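
The store ordering I mean is simply this (a simplified sketch with stand-in names, not the actual patch code):

```c
#include <stdbool.h>

/* With both flags volatile, the compiler must not reorder these two
 * stores, so a signal handler interrupting between them sees either
 * "no interrupt yet" or "interrupt with its reason already recorded",
 * never a pending interrupt whose reason is missing. */
volatile bool conn_lost_flag = false;
volatile bool intr_pending_flag = false;

void report_connection_lost(void)
{
    conn_lost_flag = true;      /* record the reason first... */
    intr_pending_flag = true;   /* ...then announce the interrupt */
}
```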

One small change I'd like to make is to treat the loss of connection
more as a new "top-level" event, rather than as a new reason for query
cancellation. A lost connection is more like receiving SIGTERM, really.
Please take a look at the attached patch. I've been testing this by
doing "SELECT pg_sleep(1), a from generate_series(1,1000) a;", and
killing the connection with "killall -9 psql".

One remaining issue I have with this is the wording of the error
message. The closest existing message we have is in old-mode COPY:

ereport(FATAL,
(errcode(ERRCODE_CONNECTION_FAILURE),
errmsg("connection lost during COPY to stdout")));

In the patch, I made the new message just "connection lost", but does
anyone else have a better opinion on that? Perhaps it should be
"connection lost while sending response to client". Or "connection lost
during execution of statement", but that might not be accurate if we're
not executing a statement at the moment, but just sending a "sync"
message or similar.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Attachments:

pg_discon_cancel.v1-heikki.patchtext/x-diff; name=pg_discon_cancel.v1-heikki.patchDownload
diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index b83a2ef..c36cf31 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -1247,9 +1247,13 @@ internal_flush(void)
 
 			/*
 			 * We drop the buffered data anyway so that processing can
-			 * continue, even though we'll probably quit soon.
+			 * continue, even though we'll probably quit soon. We also
+			 * set a flag that'll cause the next CHECK_FOR_INTERRUPTS
+			 * to terminate the connection.
 			 */
 			PqSendStart = PqSendPointer = 0;
+			ClientConnectionLost = 1;
+			InterruptPending = 1;
 			return EOF;
 		}
 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 976a832..13ea594 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2823,6 +2823,19 @@ ProcessInterrupts(void)
 					(errcode(ERRCODE_ADMIN_SHUTDOWN),
 			 errmsg("terminating connection due to administrator command")));
 	}
+	if (ClientConnectionLost)
+	{
+		QueryCancelPending = false;		/* lost connection trumps QueryCancel */
+		ImmediateInterruptOK = false;	/* not idle anymore */
+		DisableNotifyInterrupt();
+		DisableCatchupInterrupt();
+		/* As in quickdie, don't risk sending to client during auth */
+		if (ClientAuthInProgress && whereToSendOutput == DestRemote)
+			whereToSendOutput = DestNone;
+		ereport(FATAL,
+				(errcode(ERRCODE_CONNECTION_FAILURE),
+				 errmsg("connection lost")));
+	}
 	if (QueryCancelPending)
 	{
 		QueryCancelPending = false;
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 9ce64e6..9417c7a 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -29,6 +29,7 @@ ProtocolVersion FrontendProtocol;
 volatile bool InterruptPending = false;
 volatile bool QueryCancelPending = false;
 volatile bool ProcDiePending = false;
+volatile bool ClientConnectionLost = false;
 volatile bool ImmediateInterruptOK = false;
 volatile uint32 InterruptHoldoffCount = 0;
 volatile uint32 CritSectionCount = 0;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 4ee08fe..8048fc1 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -55,6 +55,12 @@
  * course, only if the interrupt holdoff counter is zero).	See the
  * related code for details.
  *
+ * A lost connection is handled similarly, although the ClientConnectionLost
+ * flag is not set in a signal handler, but whenever we get a nonrecoverable
+ * error writing to the socket. If there was a signal for a broken
+ * connection, we could make use of it by setting ClientConnectionLost in
+ * the signal handler.
+ *
  * A related, but conceptually distinct, mechanism is the "critical section"
  * mechanism.  A critical section not only holds off cancel/die interrupts,
  * but causes any ereport(ERROR) or ereport(FATAL) to become ereport(PANIC)
@@ -70,6 +76,8 @@ extern PGDLLIMPORT volatile bool InterruptPending;
 extern volatile bool QueryCancelPending;
 extern volatile bool ProcDiePending;
 
+extern volatile bool ClientConnectionLost;
+
 /* these are marked volatile because they are examined by signal handlers: */
 extern volatile bool ImmediateInterruptOK;
 extern PGDLLIMPORT volatile uint32 InterruptHoldoffCount;
#18Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Heikki Linnakangas (#17)
Detecting loss of client connection (was Re: Bug in walsender when calling out to do_pg_stop_backup (and others?))

Thinking about this some more, why don't we just elog(FATAL) in
internal_flush(), if the write fails, instead of setting the flag and
waiting for the next CHECK_FOR_INTERRUPTS(). That sounds scary at first,
but why not?

There's this comment in internal_flush():

/*
* Careful: an ereport() that tries to write to the client would
* cause recursion to here, leading to stack overflow and core
* dump! This message must go *only* to the postmaster log.

That's understandable.

* If a client disconnects while we're in the midst of output, we
* might write quite a bit of data before we get to a safe query
* abort point. So, suppress duplicate log messages.

But what about this? Tracing back the callers, I don't see any that
would be upset if we just threw an error there. One scary aspect is if
you're within a critical section, but I don't think we currently send
any messages while in a critical section. And we could refrain from
throwing the error if we're in a critical section, to be safe.

*/
	if (errno != last_reported_send_errno)
	{
		last_reported_send_errno = errno;
		ereport(COMMERROR,
				(errcode_for_socket_access(),
				 errmsg("could not send data to client: %m")));
	}

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#19Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Heikki Linnakangas (#17)
Re: Bug in walsender when calling out to do_pg_stop_backup (and others?)

On 03.12.2011 18:37, Heikki Linnakangas wrote:

One small change I'd like to make is to treat the loss of connection
more as a new "top-level" event, rather than as a new reason for query
cancellation. A lost connection is more like receiving SIGTERM, really.
Please take a look at the attached patch. I've been testing this by
doing "SELECT pg_sleep(1), a from generate_series(1,1000) a;", and
killing the connection with "killall -9 psql".

Ok, committed this.

One remaining issue I have with this is the wording of the error
message. The closest existing message we have is in old-mode COPY:

ereport(FATAL,
(errcode(ERRCODE_CONNECTION_FAILURE),
errmsg("connection lost during COPY to stdout")));

In the patch, I made the new message just "connection lost", but does
anyone else have a better opinion on that? Perhaps it should be
"connection lost while sending response to client". Or "connection lost
during execution of statement", but that might not be accurate if we're
not executing a statement at the moment, but just sending a "sync"
message or similar.

I made the error message "connection to client lost" in the end.

I still wonder if it would be safe to simply elog(FATAL) in
internal_flush(), but didn't dare to do that without more thorough analysis.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com