[PATCH] Add SIGCHLD catch to psql
Greetings,
Toying around with FETCH_COUNT today, I discovered that it didn't do
the #1 thing I really wanted to use it for: query large tables without
having to worry about LIMIT to see the first couple hundred records.
The reason is simple: psql ignores $PAGER exiting, which means that it
will happily continue pulling down the entire large table long after
you've stopped caring, which means you still have to wait forever.
The attached, admittedly quick hack, fixes this by having psql catch
SIGCHLDs using handle_sigint. I've tested this and it doesn't
appear to obviously break other cases where we have children (\!, for
example), since we're not going to be running a database query when
we're doing those, and if we are, and the child dies, we probably want
to *stop* anyway, similar to the $PAGER issue.
Another approach that I considered was fixing various things to deal
cleanly with writes failing to $PAGER (I presume the writes *were*
failing, since less was in a defunct state, but I didn't actually
test). This solution was simpler, faster to code and check, and a lot
less invasive (or so it seemed to me at the time).
Anyway, this makes FETCH_COUNT a lot more useful, and, in my view, the
current behaviour of completely ignoring $PAGER exiting is a bug.
Thanks,
Stephen
Attachment: psql-sigchld.diff (text/x-diff; charset=us-ascii)
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index f605c97..dcab436 100644
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*************** NoticeProcessor(void *arg, const char *m
*** 188,194 ****
/*
* Code to support query cancellation
*
! * Before we start a query, we enable the SIGINT signal catcher to send a
* cancel request to the backend. Note that sending the cancel directly from
* the signal handler is safe because PQcancel() is written to make it
* so. We use write() to report to stderr because it's better to use simple
--- 188,194 ----
/*
* Code to support query cancellation
*
! * Before we start a query, we enable SIGINT and SIGCHLD signals to send a
* cancel request to the backend. Note that sending the cancel directly from
* the signal handler is safe because PQcancel() is written to make it
* so. We use write() to report to stderr because it's better to use simple
*************** NoticeProcessor(void *arg, const char *m
*** 208,213 ****
--- 208,218 ----
* catcher to longjmp through sigint_interrupt_jmp. We assume readline and
* fgets are coded to handle possible interruption. (XXX currently this does
* not work on win32, so control-C is less useful there)
+ *
+ * SIGCHLD is also caught and handled the same to deal with cases where a user's
+ * PAGER or other child process exits. Otherwise, we would just keep sending
+ * data to a dead/zombied process. This won't typically matter except when
+ * FETCH_COUNT is used.
*/
volatile bool sigint_interrupt_enabled = false;
*************** void
*** 259,264 ****
--- 264,272 ----
setup_cancel_handler(void)
{
pqsignal(SIGINT, handle_sigint);
+
+ /* Also send SIGCHLD signals, to catch cases where the user exits PAGER */
+ pqsignal(SIGCHLD, handle_sigint);
}
#else /* WIN32 */
Stephen Frost wrote:
-- Start of PGP signed section.
Anyway, this makes FETCH_COUNT a lot more useful, and, in my view, the current behaviour of completely ignoring $PAGER exiting is a bug.
Please add this to the next commit-fest:
https://commitfest.postgresql.org/action/commitfest_view/inprogress
Thanks.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
On Fri, May 14, 2010 at 04:24:43PM -0400, Bruce Momjian wrote:
Please add this to the next commit-fest:
https://commitfest.postgresql.org/action/commitfest_view/inprogress
Thanks.
Wouldn't this count as a bug fix?
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Sat, May 15, 2010 at 7:46 PM, David Fetter <david@fetter.org> wrote:
Wouldn't this count as a bug fix?
Possibly, but changes to signal handlers are pretty global and can
sometimes have surprising side effects. I'm all in favor of someone
reviewing the patch - any volunteers? One case to test might be
reading input from a file that contains \! escapes. More generally,
we need to consider every way that psql can get SIGCHLD and think
about whether this is the right behavior.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Robert Haas <robertmhaas@gmail.com> writes:
Possibly, but changes to signal handlers are pretty global and can
sometimes have surprising side effects. I'm all in favor of someone
reviewing the patch - any volunteers? One case to test might be
reading input from a file that contains \! escapes. More generally,
we need to consider every way that psql can get SIGCHLD and think
about whether this is the right behavior.
I think this will introduce far more bugs than it fixes. A saner
approach, which would also help for other corner cases such as
out-of-disk-space, would be to check for write failures on the output
file and abandon the query if any occur.
regards, tom lane
Tom Lane wrote:
I think this will introduce far more bugs than it fixes. A saner
approach, which would also help for other corner cases such as
out-of-disk-space, would be to check for write failures on the output
file and abandon the query if any occur.
Is this a TODO?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
A saner
approach, which would also help for other corner cases such as
out-of-disk-space, would be to check for write failures on the output
file and abandon the query if any occur.
I had considered this, but I'm not sure we really need to catch *every*
write failure. Perhaps just catching if the '\n' at the end of a row
fails to be written out would be sufficient? Then turning around and
setting cancel_query might be enough. I'll write that up and test if
it works.
Thanks,
Stephen
Stephen Frost <sfrost@snowman.net> writes:
I had considered this, but I'm not sure we really need to catch *every*
write failure. Perhaps just catching if the '\n' at the end of a row
fails to be written out would be sufficient?
If you're combining this with the FETCH_COUNT logic then it seems like
it'd be sufficient to check ferror(fout) once per fetch chunk, and just
fall out of that loop then. I don't want psql issuing query cancels
on its own authority, either.
regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
If you're combining this with the FETCH_COUNT logic then it seems like
it'd be sufficient to check ferror(fout) once per fetch chunk, and just
fall out of that loop then. I don't want psql issuing query cancels
on its own authority, either.
Attached is a patch that just checks the result from the existing
fflush() inside the FETCH_COUNT loop and drops out of that loop if we
get an error from it.
Thanks!
Stephen
Attachment: psql-check-fflush.diff (text/x-diff; charset=us-ascii)
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index f605c97..fae1e5a 100644
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*************** ExecQueryUsingCursor(const char *query,
*** 982,987 ****
--- 982,988 ----
char fetch_cmd[64];
instr_time before,
after;
+ int flush_error;
*elapsed_msec = 0;
*************** ExecQueryUsingCursor(const char *query,
*** 1098,1106 ****
/*
* Make sure to flush the output stream, so intermediate results are
! * visible to the client immediately.
*/
! fflush(pset.queryFout);
/* after the first result set, disallow header decoration */
my_popt.topt.start_table = false;
--- 1099,1109 ----
/*
* Make sure to flush the output stream, so intermediate results are
! * visible to the client immediately. We check the results because
! * if the pager dies/exits/etc, there's no sense throwing more data
! * at it.
*/
! flush_error = fflush(pset.queryFout);
/* after the first result set, disallow header decoration */
my_popt.topt.start_table = false;
*************** ExecQueryUsingCursor(const char *query,
*** 1108,1114 ****
PQclear(results);
! if (ntuples < pset.fetch_count || cancel_pressed)
break;
}
--- 1111,1117 ----
PQclear(results);
! if (ntuples < pset.fetch_count || cancel_pressed || flush_error)
break;
}
Stephen Frost <sfrost@snowman.net> writes:
Attached is a patch that just checks the result from the existing
fflush() inside the FETCH_COUNT loop and drops out of that loop if we
get an error from it.
I thought it might be about that simple once you went at it the right
way ;-). However, I'd suggest checking ferror(pset.queryFout) as well
as the fflush result. It's not clear to me whether fflush should be
counted on to report an error that actually occurred in a previous
fwrite. (It's also unclear why fflush isn't documented to set the stream
error indicator on failure, but it isn't.)
regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
I thought it might be about that simple once you went at it the right
way ;-). However, I'd suggest checking ferror(pset.queryFout) as well
as the fflush result. It's not clear to me whether fflush should be
counted on to report an error that actually occurred in a previous
fwrite. (It's also unclear why fflush isn't documented to set the stream
error indicator on failure, but it isn't.)
Sure, I can add the ferror() check. Patch attached.
My man page (Debian/Linux) has this to say about fflush():
DESCRIPTION
The function fflush() forces a write of all user-space buffered
data for the given output or update stream via the stream’s
underlying write function. The open status of the stream
is unaffected.
If the stream argument is NULL, fflush() flushes all open output
streams.
For a non-locking counterpart, see unlocked_stdio(3).
RETURN VALUE
Upon successful completion 0 is returned. Otherwise, EOF is
returned and the global variable errno is set to indicate the
error.
ERRORS
EBADF Stream is not an open stream, or is not open for writing.
The function fflush() may also fail and set errno for any of the
errors specified for the routine write(2).
CONFORMING TO
C89, C99.
Thanks,
Stephen
Attachment: psql-check-fflush-20100517-2.diff (text/x-diff; charset=us-ascii)
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index f605c97..25340f6 100644
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
*************** ExecQueryUsingCursor(const char *query,
*** 982,987 ****
--- 982,988 ----
char fetch_cmd[64];
instr_time before,
after;
+ int flush_error;
*elapsed_msec = 0;
*************** ExecQueryUsingCursor(const char *query,
*** 1098,1106 ****
/*
* Make sure to flush the output stream, so intermediate results are
! * visible to the client immediately.
*/
! fflush(pset.queryFout);
/* after the first result set, disallow header decoration */
my_popt.topt.start_table = false;
--- 1099,1109 ----
/*
* Make sure to flush the output stream, so intermediate results are
! * visible to the client immediately. We check the results because
! * if the pager dies/exits/etc, there's no sense throwing more data
! * at it.
*/
! flush_error = fflush(pset.queryFout);
/* after the first result set, disallow header decoration */
my_popt.topt.start_table = false;
*************** ExecQueryUsingCursor(const char *query,
*** 1108,1114 ****
PQclear(results);
! if (ntuples < pset.fetch_count || cancel_pressed)
break;
}
--- 1111,1124 ----
PQclear(results);
! /* Check if we are at the end, if a cancel was pressed, or if
! * there were any errors either trying to flush out the results,
! * or more generally on the output stream at all. If we hit any
! * errors writing things to the stream, we presume $PAGER has
!	 * disappeared and stop bothering to pull down more data.
! */
! if (ntuples < pset.fetch_count || cancel_pressed || flush_error ||
! ferror(pset.queryFout))
break;
}
Stephen Frost <sfrost@snowman.net> writes:
Sure, I can add the ferror() check. Patch attached.
This seemed pretty small and uncontroversial, so I went ahead and
committed it for 9.0. I rearranged the order of operations a bit to
make it seem more coherent, and also added an initial clearerr() just
to forestall problems if stdout had the error flag set for some reason.
regards, tom lane