libpq and psql not on same page about SIGPIPE

Started by Tom Laneabout 21 years ago19 messages
#1Tom Lane
tgl@sss.pgh.pa.us

libpq compiled with --enable-thread-safety thinks it can set the SIGPIPE
signal handler. It thinks once is enough.

psql thinks it can arbitrarily flip the signal handler between SIG_IGN
and SIG_DFL. Ergo, after the first use of the pager for output, libpq's
SIGPIPE handling will be broken.

I submit that psql is unlikely to be the only program that does this,
and therefore that libpq must be considered broken, not psql.

regards, tom lane

#2Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#1)
Re: libpq and psql not on same page about SIGPIPE

Tom Lane wrote:

libpq compiled with --enable-thread-safety thinks it can set the SIGPIPE
signal handler. It thinks once is enough.

psql thinks it can arbitrarily flip the signal handler between SIG_IGN
and SIG_DFL. Ergo, after the first use of the pager for output, libpq's
SIGPIPE handling will be broken.

I submit that psql is unlikely to be the only program that does this,
and therefore that libpq must be considered broken, not psql.

I have researched possible fixes for our threading sigpipe handling in
libpq. Basically, we need to ignore SIGPIPE in socket send() (and
SSL_write) because if the backend dies unexpectedly, the process will
die. libpq would rather trap the failure.

In 7.4.X we set ignore for SIGPIPE before write and reset it after
write, but that doesn't work for threading because it affects all
threads, not just the thread using libpq.

Our current setup is wrong because an application could change SIGPIPE
for its own purposes (like psql does) and remove our custom thread
handler for sigpipe.

The best solution seems to be one suggested by Manfred in November of
2003:

signal handlers are a process property, not a thread property - that
code is broken for multi-threaded apps.
At least that's how I understand the opengroup man page, and a quick
google confirmed that:
http://groups.google.de/groups?selm=353662BF.9D70F63A%40brighttiger.com

I haven't found a reliable thread-safe approach yet:
My first idea was block with pthread_sigmask, after send check if
pending with sigpending, and then delete with sigwait, and restore
blocked state. But that breaks if SIGPIPE is blocked and a signal is
already pending: there is no way to remove our additional SIGPIPE. I
don't see how we can avoid destroying the realtime signal info.

His idea of pthread_sigmask/send/sigpending/sigwait/restore-mask. Seems
we could also check errno for SIGPIPE rather than calling sigpending.

He has a concern about an application that already blocked SIGPIPE and
has a pending SIGPIPE signal waiting already. One idea would be to
check for sigpending() before the send() and clear the signal only if
SIGPIPE wasn't pending before the call. I realize that if our send()
also generates a SIGPIPE it would remove the previous realtime signal
info but that seems a minor problem.

Comments? This seems like our only solution.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#3Manfred Spraul
manfred@colorfullife.com
In reply to: Bruce Momjian (#2)
Re: libpq and psql not on same page about SIGPIPE

Bruce Momjian wrote:

Comments? This seems like our only solution.

This would be a transparent solution. Another approach would be:
- Use the old 7.3 approach by default. This means perfect backward
compatibility for single-threaded apps and broken multithreaded apps.
- Add a new PQinitDB(int disableSigpipeHandler) initialization function.
Document that multithreaded apps must call the function with
disableSigpipeHandle=1 and handle SIGPIPE for libpq. Perhaps with a
reference implementation in libpq (i.e. a sigpipeMode with 0 for old
approach, 1 for do nothing, 2 for install our own handler).

It would prefer that approach:
It means that the multithreaded libpq apps must be updated [are there
any?], but the solution is simpler and less fragile than calling 4
signal handling function in a row to selectively block SIGPIPE per-thread.

--
Manfred

#4Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Manfred Spraul (#3)
Re: libpq and psql not on same page about SIGPIPE

Manfred Spraul wrote:

Bruce Momjian wrote:

Comments? This seems like our only solution.

This would be a transparent solution. Another approach would be:
- Use the old 7.3 approach by default. This means perfect backward
compatibility for single-threaded apps and broken multithreaded apps.
- Add a new PQinitDB(int disableSigpipeHandler) initialization function.
Document that multithreaded apps must call the function with
disableSigpipeHandle=1 and handle SIGPIPE for libpq. Perhaps with a
reference implementation in libpq (i.e. a sigpipeMode with 0 for old
approach, 1 for do nothing, 2 for install our own handler).

It would prefer that approach:
It means that the multithreaded libpq apps must be updated [are there
any?], but the solution is simpler and less fragile than calling 4
signal handling function in a row to selectively block SIGPIPE per-thread.

I think we can get away with three function calls because we can check
errno for EPIPE from the send() call. We already have two function
calls in there so I don't see a third as a huge problem and not worth an
API change unless someone tells us it is a performance hit.

One thing I know from the broken 7.4 code is that calling signal() from
inside a thread is very slow on Linux, like a 20% performance hit
because I assume it has to adjust all the threads signal stacks or
something. Anyway, I assume thread_sigmask() and sigpending() are not
huge hits.

In fact, by checking EPIPE from send() we don't need sigpending at all.
We just block and (if EPIPE) clear using sigwait(), then unblock. I
think we can document that if you are blocking SIGPIPE in your
application and wait to handle it later, you can't keep the pending
signal through a libpq call. I think that is a much easier requirement
than adding a new API call. The most common case of SIG_IGN/SIG_DFL is
not affected by this, only cases where they define their own signal
handler, block it, and then trying to keep a call active across a libpq
call.

Manfred, glad you are around to discuss this. After much research I
came up with a method and then found your description that matched it so
I felt I was on the right track.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#5Oliver Jowett
oliver@opencloud.com
In reply to: Bruce Momjian (#4)
Re: libpq and psql not on same page about SIGPIPE

Bruce Momjian wrote:

[... SIGPIPE suppression in libpq ...]

Linux also has MSG_NOSIGNAL as a send() flag that might be useful. It
suppresses generation of SIGPIPE for just that call. No, it doesn't work
for SSL and it's probably not very portable, but it might be a good
platform-specific optimization for the common case.

-O

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#2)
Re: libpq and psql not on same page about SIGPIPE

Bruce Momjian <pgman@candle.pha.pa.us> writes:

His idea of pthread_sigmask/send/sigpending/sigwait/restore-mask. Seems
we could also check errno for SIGPIPE rather than calling sigpending.

He has a concern about an application that already blocked SIGPIPE and
has a pending SIGPIPE signal waiting already. One idea would be to
check for sigpending() before the send() and clear the signal only if
SIGPIPE wasn't pending before the call. I realize that if our send()
also generates a SIGPIPE it would remove the previous realtime signal
info but that seems a minor problem.

Supposing that we don't touch the signal handler at all, then it is
possible that the application has set it to SIG_IGN, in which case a
SIGPIPE would be discarded rather than going into the pending mask.
So I think the logic has to be:

pthread_sigmask to block SIGPIPE and save existing signal mask

send();

if (errno == EPIPE)
{
if (sigpending indicates SIGPIPE pending)
use sigwait to clear SIGPIPE;
}

pthread_sigmask to restore prior signal mask

The only case that is problematic is where the application had
already blocked SIGPIPE and there is a pending SIGPIPE signal when
we are entered, *and* we get SIGPIPE ourselves.

If the C library does not support queued signals then our sigwait will
clear both our own EPIPE and the pending signal. This is annoying but
it doesn't seem fatal --- if the app is writing on a closed pipe then
it'll probably try it again later and get the signal again.

If the C library does support queued signals then we will read the
existing SIGPIPE condition and leave our own signal in the queue. This
is no problem to the extent that one pending SIGPIPE looks just like
another --- does anyone know of platforms where there is additional info
carried by a SIGPIPE event?

This seems workable as long as we document the possible gotchas.

regards, tom lane

#7Oliver Jowett
oliver@opencloud.com
In reply to: Tom Lane (#6)
Re: libpq and psql not on same page about SIGPIPE

Tom Lane wrote:

If the C library does support queued signals then we will read the
existing SIGPIPE condition and leave our own signal in the queue. This
is no problem to the extent that one pending SIGPIPE looks just like
another --- does anyone know of platforms where there is additional info
carried by a SIGPIPE event?

POSIX.1b / SA_SIGINFO? SIGPIPE does not fill much of siginfo_t, but the
3rd handler arg has the interrupted execution context.

-O

#8Manfred Spraul
manfred@colorfullife.com
In reply to: Tom Lane (#6)
Re: libpq and psql not on same page about SIGPIPE

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

His idea of pthread_sigmask/send/sigpending/sigwait/restore-mask. Seems
we could also check errno for SIGPIPE rather than calling sigpending.

He has a concern about an application that already blocked SIGPIPE and
has a pending SIGPIPE signal waiting already. One idea would be to
check for sigpending() before the send() and clear the signal only if
SIGPIPE wasn't pending before the call. I realize that if our send()
also generates a SIGPIPE it would remove the previous realtime signal
info but that seems a minor problem.

Supposing that we don't touch the signal handler at all, then it is
possible that the application has set it to SIG_IGN, in which case a
SIGPIPE would be discarded rather than going into the pending mask.
So I think the logic has to be:

pthread_sigmask to block SIGPIPE and save existing signal mask

send();

if (errno == EPIPE)
{
if (sigpending indicates SIGPIPE pending)
use sigwait to clear SIGPIPE;
}

pthread_sigmask to restore prior signal mask

The only case that is problematic is where the application had
already blocked SIGPIPE and there is a pending SIGPIPE signal when
we are entered, *and* we get SIGPIPE ourselves.

If the C library does not support queued signals then our sigwait will
clear both our own EPIPE and the pending signal. This is annoying but
it doesn't seem fatal --- if the app is writing on a closed pipe then
it'll probably try it again later and get the signal again.

If the C library does support queued signals then we will read the
existing SIGPIPE condition and leave our own signal in the queue. This
is no problem to the extent that one pending SIGPIPE looks just like
another --- does anyone know of platforms where there is additional info
carried by a SIGPIPE event?

Linux stores pid/uid together with the signal. pid doesn't matter and no
sane programmer will look at the uid, so it seems to be possible.

This seems workable as long as we document the possible gotchas.

Is that really worthwhile? There are half a dozend assumption about the
C library and kernel internal efficiency of the signal handling
functions in the proposal. Adding a PQinitLib function is obviously a
larger change, but it solves the problem.
I'm aware of one minor gotcha: PQinSend() is not usable right now: it
relies on the initialization of pq_thread_in_send, which is only created
in the middle of the first connectDB(). That makes proper signal
handling for the first connection impossible.

--
Manfred

#9Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#6)
Re: libpq and psql not on same page about SIGPIPE

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

His idea of pthread_sigmask/send/sigpending/sigwait/restore-mask. Seems
we could also check errno for SIGPIPE rather than calling sigpending.

He has a concern about an application that already blocked SIGPIPE and
has a pending SIGPIPE signal waiting already. One idea would be to
check for sigpending() before the send() and clear the signal only if
SIGPIPE wasn't pending before the call. I realize that if our send()
also generates a SIGPIPE it would remove the previous realtime signal
info but that seems a minor problem.

Supposing that we don't touch the signal handler at all, then it is
possible that the application has set it to SIG_IGN, in which case a
SIGPIPE would be discarded rather than going into the pending mask.

Right.

So I think the logic has to be:

pthread_sigmask to block SIGPIPE and save existing signal mask

send();

if (errno == EPIPE)
{
if (sigpending indicates SIGPIPE pending)
use sigwait to clear SIGPIPE;
}

pthread_sigmask to restore prior signal mask

Yes, that is the logic in my patch, except that I don't check errno, I
just call sigpending(). There are a few writes and it seemed impossible
to check them all. Also, we might get a SIGPIPE but as you mentioned
the SIGPIPE may have been ignored.

The only case that is problematic is where the application had
already blocked SIGPIPE and there is a pending SIGPIPE signal when
we are entered, *and* we get SIGPIPE ourselves.

Right.

If the C library does not support queued signals then our sigwait will
clear both our own EPIPE and the pending signal. This is annoying but
it doesn't seem fatal --- if the app is writing on a closed pipe then
it'll probably try it again later and get the signal again.

Right.

If the C library does support queued signals then we will read the
existing SIGPIPE condition and leave our own signal in the queue. This
is no problem to the extent that one pending SIGPIPE looks just like
another --- does anyone know of platforms where there is additional info
carried by a SIGPIPE event?

Right. We could add a sigpending() check and if set we just don't clear
the SIGPIPE signal, but this adds an additional function call into the
mix that seemed unnecessary, though I could be convinced otherwise.

This seems workable as long as we document the possible gotchas.

My patch documents the limitation. I finished reading the
Addison-Wesley "Programming with POSIX Threads" book and it has helped
me with this stuff.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#10Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Manfred Spraul (#8)
Re: libpq and psql not on same page about SIGPIPE

Manfred Spraul wrote:

This seems workable as long as we document the possible gotchas.

Is that really worthwhile? There are half a dozend assumption about the
C library and kernel internal efficiency of the signal handling
functions in the proposal. Adding a PQinitLib function is obviously a

The main path uses pthread_sigmask() and sigpending(). Are those
possible performance problems? I see how signal() would be a thread
problem, but not those.

larger change, but it solves the problem.
I'm aware of one minor gotcha: PQinSend() is not usable right now: it
relies on the initialization of pq_thread_in_send, which is only created
in the middle of the first connectDB(). That makes proper signal
handling for the first connection impossible.

I think that whole PQinSend thing is pretty ugly, even if I did write
it. My current patch seems like a great improvement.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Manfred Spraul (#8)
Re: libpq and psql not on same page about SIGPIPE

Manfred Spraul <manfred@colorfullife.com> writes:

Is that really worthwhile? There are half a dozend assumption about the
C library and kernel internal efficiency of the signal handling
functions in the proposal. Adding a PQinitLib function is obviously a
larger change, but it solves the problem.

Not really: it only solves the problem *if you change the application*,
which is IMHO not acceptable. In particular, why should a non-threaded
app expect to have to change to deal with this issue? But we can't
safely build a thread-safe libpq.so for general use if it breaks
non-threaded apps that haven't been changed.

As for the efficiency argument, we have been doing two pqsignal()s per
send() for years and years; I see no reason to think this is worse.

regards, tom lane

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#9)
Re: libpq and psql not on same page about SIGPIPE

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Yes, that is the logic in my patch, except that I don't check errno, I
just call sigpending().

No, that's wrong: if there is a pending SIGPIPE that belongs to the
outer app, you'd clear it.

There are a few writes and it seemed impossible
to check them all.

Hmm? There is only one place this needs to be done, namely
pqsecure_write.

BTW, have you posted the patch yet or are you still working on it?
The mail server seems a bit flaky today ...

regards, tom lane

#13Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#12)
1 attachment(s)
Re: [HACKERS] libpq and psql not on same page about SIGPIPE

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Yes, that is the logic in my patch, except that I don't check errno, I
just call sigpending().

No, that's wrong: if there is a pending SIGPIPE that belongs to the
outer app, you'd clear it.

True, but I documented that in the patch.

There are a few writes and it seemed impossible
to check them all.

Hmm? There is only one place this needs to be done, namely
pqsecure_write.

Look also in fe-print.c. Looks like a lot of popen writes in there.
I can do it but it will be harder.

BTW, have you posted the patch yet or are you still working on it?
The mail server seems a bit flaky today ...

OK, patch attached. I already sent it but who knows what happened to
it.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Attachments:

/pgpatches/sigpipetext/plainDownload
Index: configure
===================================================================
RCS file: /cvsroot/pgsql/configure,v
retrieving revision 1.409
diff -c -c -r1.409 configure
*** configure	30 Nov 2004 06:13:03 -0000	1.409
--- configure	1 Dec 2004 22:53:58 -0000
***************
*** 17431,17436 ****
--- 17431,17448 ----
  fi
  HAVE_POSIX_SIGNALS=$pgac_cv_func_posix_signals
  
+ if test "$pgac_cv_func_posix_signals" != yes -a "$enable_thread_safety" = yes; then
+   { { echo "$as_me:$LINENO: error:
+ *** Thread-safety requires POSIX signals, which are not supported by your
+ *** operating system.
+ " >&5
+ echo "$as_me: error:
+ *** Thread-safety requires POSIX signals, which are not supported by your
+ *** operating system.
+ " >&2;}
+    { (exit 1); exit 1; }; }
+ fi
+ 
  if test $ac_cv_func_fseeko = yes; then
  # Check whether --enable-largefile or --disable-largefile was given.
  if test "${enable_largefile+set}" = set; then
Index: configure.in
===================================================================
RCS file: /cvsroot/pgsql/configure.in,v
retrieving revision 1.387
diff -c -c -r1.387 configure.in
*** configure.in	30 Nov 2004 06:13:04 -0000	1.387
--- configure.in	1 Dec 2004 22:54:11 -0000
***************
*** 1174,1179 ****
--- 1174,1186 ----
  
  
  PGAC_FUNC_POSIX_SIGNALS
+ if test "$pgac_cv_func_posix_signals" != yes -a "$enable_thread_safety" = yes; then
+   AC_MSG_ERROR([
+ *** Thread-safety requires POSIX signals, which are not supported by your
+ *** operating system.
+ ])
+ fi
+ 
  if test $ac_cv_func_fseeko = yes; then
  AC_SYS_LARGEFILE
  fi
Index: doc/src/sgml/libpq.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/libpq.sgml,v
retrieving revision 1.169
diff -c -c -r1.169 libpq.sgml
*** doc/src/sgml/libpq.sgml	27 Nov 2004 21:56:04 -0000	1.169
--- doc/src/sgml/libpq.sgml	1 Dec 2004 22:54:45 -0000
***************
*** 3955,3975 ****
  </para>
  
  <para>
! <application>libpq</application> must ignore <literal>SIGPIPE</> signals
! generated internally by <function>send()</> calls to backend processes.
! When <productname>PostgreSQL</> is configured without
! <literal>--enable-thread-safety</>, <application>libpq</> sets
! <literal>SIGPIPE</> to <literal>SIG_IGN</> before each
! <function>send()</> call and restores the original signal handler after
! completion. When <literal>--enable-thread-safety</> is used,
! <application>libpq</> installs its own <literal>SIGPIPE</> handler
! before the first database connection.  This handler uses thread-local
! storage to determine if a <literal>SIGPIPE</> signal has been generated
! by a libpq <function>send()</>. If an application wants to install
! its own <literal>SIGPIPE</> signal handler, it should call
! <function>PQinSend()</> to determine if it should ignore the
! <literal>SIGPIPE</> signal. This function is available in both
! thread-safe and non-thread-safe versions of <application>libpq</>.
  </para>
  
  <para>
--- 3955,3964 ----
  </para>
  
  <para>
! <application>libpq</application> blocks and discards <literal>SIGPIPE</>
! signals generated internally by <function>send()</> calls to the backend 
! process. Therefore, applications should not expect blocked <literal>SIGPIPE</>
! signals to remain across <application>libpq</application> function calls.
  </para>
  
  <para>
Index: doc/src/sgml/ref/copy.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/copy.sgml,v
retrieving revision 1.60
diff -c -c -r1.60 copy.sgml
*** doc/src/sgml/ref/copy.sgml	27 Nov 2004 21:56:05 -0000	1.60
--- doc/src/sgml/ref/copy.sgml	1 Dec 2004 22:54:57 -0000
***************
*** 3,8 ****
--- 3,9 ----
  PostgreSQL documentation
  -->
  
+ 
  <refentry id="SQL-COPY">
   <refmeta>
    <refentrytitle id="sql-copy-title">COPY</refentrytitle>
Index: src/interfaces/libpq/fe-connect.c
===================================================================
RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-connect.c,v
retrieving revision 1.289
diff -c -c -r1.289 fe-connect.c
*** src/interfaces/libpq/fe-connect.c	30 Oct 2004 23:11:26 -0000	1.289
--- src/interfaces/libpq/fe-connect.c	1 Dec 2004 22:55:18 -0000
***************
*** 865,879 ****
  	const char *node = NULL;
  	int			ret;
  
- #ifdef ENABLE_THREAD_SAFETY
- #ifndef WIN32
- 	static pthread_once_t check_sigpipe_once = PTHREAD_ONCE_INIT;
- 
- 	/* Check only on first connection request */
- 	pthread_once(&check_sigpipe_once, pq_check_sigpipe_handler);
- #endif
- #endif
- 
  	if (!conn)
  		return 0;
  
--- 865,870 ----
Index: src/interfaces/libpq/fe-print.c
===================================================================
RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-print.c,v
retrieving revision 1.55
diff -c -c -r1.55 fe-print.c
*** src/interfaces/libpq/fe-print.c	9 Nov 2004 15:57:57 -0000	1.55
--- src/interfaces/libpq/fe-print.c	1 Dec 2004 22:55:25 -0000
***************
*** 91,97 ****
  		int			total_line_length = 0;
  		int			usePipe = 0;
  		char	   *pagerenv;
! 
  #if !defined(ENABLE_THREAD_SAFETY) && !defined(WIN32)
  		pqsigfunc	oldsigpipehandler = NULL;
  #endif
--- 91,100 ----
  		int			total_line_length = 0;
  		int			usePipe = 0;
  		char	   *pagerenv;
! #ifdef ENABLE_THREAD_SAFETY
! 		sigset_t	osigset;
! 		bool		sigpipe_masked = false;
! #endif
  #if !defined(ENABLE_THREAD_SAFETY) && !defined(WIN32)
  		pqsigfunc	oldsigpipehandler = NULL;
  #endif
***************
*** 189,195 ****
  				{
  					usePipe = 1;
  #ifdef ENABLE_THREAD_SAFETY
! 					pthread_setspecific(pq_thread_in_send, "t");
  #else
  #ifndef WIN32
  					oldsigpipehandler = pqsignal(SIGPIPE, SIG_IGN);
--- 192,199 ----
  				{
  					usePipe = 1;
  #ifdef ENABLE_THREAD_SAFETY
! 					pq_block_sigpipe(&osigset);
! 					sigpipe_masked = true;
  #else
  #ifndef WIN32
  					oldsigpipehandler = pqsignal(SIGPIPE, SIG_IGN);
***************
*** 311,317 ****
  			pclose(fout);
  #endif
  #ifdef ENABLE_THREAD_SAFETY
! 			pthread_setspecific(pq_thread_in_send, "f");
  #else
  #ifndef WIN32
  			pqsignal(SIGPIPE, oldsigpipehandler);
--- 315,322 ----
  			pclose(fout);
  #endif
  #ifdef ENABLE_THREAD_SAFETY
! 			if (sigpipe_masked)
! 				pq_reset_sigpipe(&osigset);
  #else
  #ifndef WIN32
  			pqsignal(SIGPIPE, oldsigpipehandler);
Index: src/interfaces/libpq/fe-secure.c
===================================================================
RCS file: /cvsroot/pgsql/src/interfaces/libpq/fe-secure.c,v
retrieving revision 1.57
diff -c -c -r1.57 fe-secure.c
*** src/interfaces/libpq/fe-secure.c	20 Nov 2004 00:35:13 -0000	1.57
--- src/interfaces/libpq/fe-secure.c	1 Dec 2004 22:55:31 -0000
***************
*** 152,163 ****
  static SSL_CTX *SSL_context = NULL;
  #endif
  
- #ifdef ENABLE_THREAD_SAFETY
- static void sigpipe_handler_ignore_send(int signo);
- pthread_key_t pq_thread_in_send = 0;	/* initializer needed on Darwin */
- static pqsigfunc pq_pipe_handler;
- #endif
- 
  /* ------------------------------------------------------------ */
  /*						 Hardcoded values						*/
  /* ------------------------------------------------------------ */
--- 152,157 ----
***************
*** 379,387 ****
  pqsecure_write(PGconn *conn, const void *ptr, size_t len)
  {
  	ssize_t		n;
! 
  #ifdef ENABLE_THREAD_SAFETY
! 	pthread_setspecific(pq_thread_in_send, "t");
  #else
  #ifndef WIN32
  	pqsigfunc	oldsighandler = pqsignal(SIGPIPE, SIG_IGN);
--- 373,383 ----
  pqsecure_write(PGconn *conn, const void *ptr, size_t len)
  {
  	ssize_t		n;
! 	
  #ifdef ENABLE_THREAD_SAFETY
! 	sigset_t	osigmask;
! 
! 	pq_block_sigpipe(&osigmask);
  #else
  #ifndef WIN32
  	pqsigfunc	oldsighandler = pqsignal(SIGPIPE, SIG_IGN);
***************
*** 454,460 ****
  		n = send(conn->sock, ptr, len, 0);
  
  #ifdef ENABLE_THREAD_SAFETY
! 	pthread_setspecific(pq_thread_in_send, "f");
  #else
  #ifndef WIN32
  	pqsignal(SIGPIPE, oldsighandler);
--- 450,456 ----
  		n = send(conn->sock, ptr, len, 0);
  
  #ifdef ENABLE_THREAD_SAFETY
! 	pq_reset_sigpipe(&osigmask);
  #else
  #ifndef WIN32
  	pqsignal(SIGPIPE, oldsighandler);
***************
*** 1216,1280 ****
  }
  #endif   /* USE_SSL */
  
- 
  #ifdef ENABLE_THREAD_SAFETY
- #ifndef WIN32
  /*
!  *	Check SIGPIPE handler and perhaps install our own.
   */
! void
! pq_check_sigpipe_handler(void)
  {
! 	pthread_key_create(&pq_thread_in_send, NULL);
! 
! 	/*
! 	 * Find current pipe handler and chain on to it.
! 	 */
! 	pq_pipe_handler = pqsignalinquire(SIGPIPE);
! 	pqsignal(SIGPIPE, sigpipe_handler_ignore_send);
! }
! 
! /*
!  *	Threaded SIGPIPE signal handler
   */
! void
! sigpipe_handler_ignore_send(int signo)
  {
! 	/*
! 	 * If we have gotten a SIGPIPE outside send(), chain or exit if we are
! 	 * at the end of the chain. Synchronous signals are delivered to the
! 	 * thread that caused the signal.
! 	 */
! 	if (!PQinSend())
  	{
! 		if (pq_pipe_handler == SIG_DFL) /* not set by application */
! 			exit(128 + SIGPIPE);	/* typical return value for SIG_DFL */
! 		else
! 			(*pq_pipe_handler) (signo); /* call original handler */
  	}
- }
- #endif
- #endif
- 
- /*
-  *	Indicates whether the current thread is in send()
-  *	For use by SIGPIPE signal handlers;  they should
-  *	ignore SIGPIPE when libpq is in send().  This means
-  *	that the backend has died unexpectedly.
-  */
- pqbool
- PQinSend(void)
- {
- #ifdef ENABLE_THREAD_SAFETY
- 	return (pthread_getspecific(pq_thread_in_send) /* has it been set? */ &&
- 			*(char *) pthread_getspecific(pq_thread_in_send) == 't') ? true : false;
- #else
  
! 	/*
! 	 * No threading: our code ignores SIGPIPE around send(). Therefore, we
! 	 * can't be in send() if we are checking from a SIGPIPE signal
! 	 * handler.
! 	 */
! 	return false;
! #endif
  }
--- 1212,1264 ----
  }
  #endif   /* USE_SSL */
  
  #ifdef ENABLE_THREAD_SAFETY
  /*
!  *	Block SIGPIPE for this thread.  This prevents send()/write() from exiting
!  *	the application.
   */
! int
! pq_block_sigpipe(sigset_t *osigset)
  {
! 	sigset_t sigpipe_sigset;
! 	
! 	sigemptyset(&sigpipe_sigset);
! 	sigaddset(&sigpipe_sigset, SIGPIPE);
! 
! 	/* Block SIGPIPE and save previous mask for later reset */
! 	return pthread_sigmask(SIG_BLOCK, &sigpipe_sigset, osigset);
! }
! 	
! /*
!  *	Discard any pending SIGPIPE and reset the signal mask.
!  *	We might be discarding a blocked SIGPIPE that we didn't generate,
!  *	but we document that you can't keep blocked SIGPIPE calls across
!  *	libpq function calls.
   */
! int
! pq_reset_sigpipe(sigset_t *osigset)
  {
! 	int	signo;
! 	sigset_t sigset;
! 
! 	/* Is SIGPIPE pending? */
! 	if (sigpending(&sigset) != 0)
! 		return -1;
! 
! 	if (sigismember(&sigset, SIGPIPE))
  	{
! 		sigset_t sigpipe_sigset;
! 		
! 		sigemptyset(&sigpipe_sigset);
! 		sigaddset(&sigpipe_sigset, SIGPIPE);
! 
! 		/* Discard pending and blocked SIGPIPE */
! 		sigwait(&sigpipe_sigset, &signo);
! 		if (signo != SIGPIPE)
! 			return -1;
  	}
  
! 	/* Restore saved block mask */
! 	return pthread_sigmask(SIG_SETMASK, osigset, NULL);
  }
+ #endif
Index: src/interfaces/libpq/libpq-fe.h
===================================================================
RCS file: /cvsroot/pgsql/src/interfaces/libpq/libpq-fe.h,v
retrieving revision 1.113
diff -c -c -r1.113 libpq-fe.h
*** src/interfaces/libpq/libpq-fe.h	30 Oct 2004 23:11:27 -0000	1.113
--- src/interfaces/libpq/libpq-fe.h	1 Dec 2004 22:55:33 -0000
***************
*** 497,508 ****
  
  /* === in fe-secure.c === */
  
- /*
-  *	Indicates whether the libpq thread is in send().
-  *	Used to ignore SIGPIPE if thread is in send().
-  */
- extern pqbool PQinSend(void);
- 
  #ifdef __cplusplus
  }
  #endif
--- 497,502 ----
Index: src/interfaces/libpq/libpq-int.h
===================================================================
RCS file: /cvsroot/pgsql/src/interfaces/libpq/libpq-int.h,v
retrieving revision 1.96
diff -c -c -r1.96 libpq-int.h
*** src/interfaces/libpq/libpq-int.h	30 Oct 2004 23:11:27 -0000	1.96
--- src/interfaces/libpq/libpq-int.h	1 Dec 2004 22:55:35 -0000
***************
*** 31,36 ****
--- 31,37 ----
  
  #ifdef ENABLE_THREAD_SAFETY
  #include <pthread.h>
+ #include <signal.h>
  #endif
  
  #ifdef WIN32_CLIENT_ONLY
***************
*** 475,489 ****
  extern ssize_t pqsecure_read(PGconn *, void *ptr, size_t len);
  extern ssize_t pqsecure_write(PGconn *, const void *ptr, size_t len);
  
- #ifdef ENABLE_THREAD_SAFETY
- extern void pq_check_sigpipe_handler(void);
- extern pthread_key_t pq_thread_in_send;
- #endif
- 
  #ifdef USE_SSL
  extern bool pq_initssllib;
  #endif
  
  /*
   * this is so that we can check if a connection is non-blocking internally
   * without the overhead of a function call
--- 476,490 ----
  extern ssize_t pqsecure_read(PGconn *, void *ptr, size_t len);
  extern ssize_t pqsecure_write(PGconn *, const void *ptr, size_t len);
  
  #ifdef USE_SSL
  extern bool pq_initssllib;
  #endif
  
+ #ifdef ENABLE_THREAD_SAFETY
+ int pq_block_sigpipe(sigset_t *osigset);
+ int pq_reset_sigpipe(sigset_t *osigset);
+ #endif
+ 
  /*
   * this is so that we can check if a connection is non-blocking internally
   * without the overhead of a function call
#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#13)
Re: [HACKERS] libpq and psql not on same page about SIGPIPE

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom Lane wrote:

No, that's wrong: if there is a pending SIGPIPE that belongs to the
outer app, you'd clear it.

True, but I documented that in the patch.

A documented bug is still a bug. The entire point of this change is to
not interfere with the behavior of the surrounding app, at least not
more than we absolutely have to; and we do not have to clear SIGPIPEs
that are certainly not ours.

regards, tom lane

#15Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#14)
Re: [HACKERS] libpq and psql not on same page about SIGPIPE

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Tom Lane wrote:

No, that's wrong: if there is a pending SIGPIPE that belongs to the
outer app, you'd clear it.

True, but I documented that in the patch.

A documented bug is still a bug. The entire point of this change is to
not interfere with the behavior of the surrounding app, at least not
more than we absolutely have to; and we do not have to clear SIGPIPEs
that are certainly not ours.

I am working on a new version that doesn't have this problem. Will post
soon.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#16Manfred Spraul
manfred@colorfullife.com
In reply to: Tom Lane (#11)
Re: libpq and psql not on same page about SIGPIPE

Tom Lane wrote:

Not really: it only solves the problem *if you change the application*,
which is IMHO not acceptable. In particular, why should a non-threaded
app expect to have to change to deal with this issue? But we can't
safely build a thread-safe libpq.so for general use if it breaks
non-threaded apps that haven't been changed.

No. non-threaded apps do not need to change. The default is the old, 7.3
code: change the signal handler around the write calls. Which means that
non-threaded apps are guaranteed to work without any changes, regardless
of the libpq thread safety setting.
Threaded apps would have to change, but how many threaded apps use
libpq? They check their code anyway - either just add PQinitLib() or
review (and potentialy update) their signal handling code if it match
any of the gotchas of the transparent handling.

--
Manfred

--
Manfred

#17Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Manfred Spraul (#16)
Re: libpq and psql not on same page about SIGPIPE

Manfred Spraul wrote:

Tom Lane wrote:

Not really: it only solves the problem *if you change the application*,
which is IMHO not acceptable. In particular, why should a non-threaded
app expect to have to change to deal with this issue? But we can't
safely build a thread-safe libpq.so for general use if it breaks
non-threaded apps that haven't been changed.

No. non-threaded apps do not need to change. The default is the old, 7.3
code: change the signal handler around the write calls. Which means that
non-threaded apps are guaranteed to work without any changes, regardless
of the libpq thread safety setting.
Threaded apps would have to change, but how many threaded apps use
libpq? They check their code anyway - either just add PQinitLib() or
review (and potentialy update) their signal handling code if it match
any of the gotchas of the transparent handling.

So without the call we do the old non-threaded version of the code.
Yea, we could do that, but I think the most recent patch is clearer.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Manfred Spraul (#16)
Re: libpq and psql not on same page about SIGPIPE

Manfred Spraul <manfred@colorfullife.com> writes:

Tom Lane wrote:

Not really: it only solves the problem *if you change the application*,
which is IMHO not acceptable.

No. non-threaded apps do not need to change. The default is the old, 7.3
code: change the signal handler around the write calls. Which means that
non-threaded apps are guaranteed to work without any changes, regardless
of the libpq thread safety setting.

Hmm, so you propose that a thread-enabled library must contain both code
paths and choose between them on the fly? That seems like the worst of
all possible worlds --- it's not backwards compatible, it's therefore
unsafe, it's slow, *and* it'll be #ifdef hell to read.

On a platform that has pthread_sigmask, ISTM we can use that
unconditionally and it'll work whether the calling app is threaded or
not. We only fall back to the pqsignal method if we aren't supporting
threads. There's no need for a runtime test nor for an API change.

regards, tom lane

#19Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Tom Lane (#18)
Re: libpq and psql not on same page about SIGPIPE

Tom Lane wrote:

Manfred Spraul <manfred@colorfullife.com> writes:

Tom Lane wrote:

Not really: it only solves the problem *if you change the application*,
which is IMHO not acceptable.

No. non-threaded apps do not need to change. The default is the old, 7.3
code: change the signal handler around the write calls. Which means that
non-threaded apps are guaranteed to work without any changes, regardless
of the libpq thread safety setting.

Hmm, so you propose that a thread-enabled library must contain both code
paths and choose between them on the fly? That seems like the worst of
all possible worlds --- it's not backwards compatible, it's therefore
unsafe, it's slow, *and* it'll be #ifdef hell to read.

On a platform that has pthread_sigmask, ISTM we can use that
unconditionally and it'll work whether the calling app is threaded or
not. We only fall back to the pqsignal method if we aren't supporting
threads. There's no need for a runtime test nor for an API change.

Agreed. Manfred, I guess the big question is why not use something that
is automatic like I just applied?

Now that the patch is in, would someone run some threaded performance
tests and see if those pg_*_sigpipe routines are causing any performance
problems.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073