Escaping from blocked send() reprised.
Hello, I have received inquiries related to blocked communication
several times in recent weeks, with different symptoms. Then I
found this message in the archive:
Subject: Escaping a blocked sendto() syscall without causing a restart
Mr. Tom Lane gave a comment in reply to it:
Offhand it looks to me like most signals would kick the backend off the
send() call ... but it would loop right back and try again. See
internal_flush() in pqcomm.c. (If you're using SSL, this diagnosis
may or may not apply.)

We can't do anything except repeat the send attempt if the client
connection is to be kept in a sane state.
(snipped)
And I'm not at all sure if we could get it to work in SSL mode...
That's true for timeouts that should continue the connection,
say, statement_timeout, but focusing on intentional backend
termination, I think it does no harm to break it up abruptly,
even if it was on SSL. On the other hand it seems still
preferable to keep the connection when it is not blocked. The
following expression detects such a blocking state just before the
next send(2), after the previous attempt was interrupted by a signal:

(ProcDiePending && select(1, NULL, fd, NULL, '1 sec') == 0)
Finally, with this patch
(break_socket_blocking_on_termination_v1.patch),
pg_terminate_backend() works even while send() is blocked, for both
SSL and non-SSL connections, after a 1-second delay.
Nevertheless, statement_timeout of course cannot be made effective
by this method, since that would break the consistency of the client
protocol. It would need a change in the client protocol to provide
an "out of band" mechanism or something, maybe.
Any suggestions?
Attached patches are:
- break_socket_blocking_on_termination_v1.patch : The patch to
break blocked state of send(2) for pg_terminate_backend().
- socket_block_test.patch : debug printing and changing send
buffer of libpq for reproducing the blocked situation.
Some points of discussion follow.
==== Discussion about the appropriateness of looking at
ProcDiePending there and calling CHECK_FOR_INTERRUPTS()
when it is set.
I am somewhat uneasy about this, but the most we could do seems to
be replacing ProcDiePending here with some other variable, say
ImmediatelyExitFromBlockedState, and somehow going back up through
the normal return path. An additional try/catch could do that, but
it seems to add no benefit for the extra complexity.
==== Discussion on breaking up the connection, especially for SSL
Breaking an SSL connection up in my_sock_write() causes the
following message on the client side if it is still alive and
resumes receiving from the connection; this seems to show that the
client handles the event properly.
| SSL SYSCALL error: EOF detected
| The connection to the server was lost. Attempting reset: Succeeded.
==== Discussion on reliability of select(2)
This method is not a perfect solution, since select(2)
sometimes gives a wrong oracle about whether the following
send(2) will block.
Even so, as far as I have seen, a select(2) issued just after a
blocked send(2) was interrupted by a signal seems to always say
'write will block', so what this patch does seems to cover most
cases, except when some amount of socket buffer space is vacated
just before the following select(). A second chance to exit from the
blocked send(2) won't come after that (before one more
pg_terminate_backend()?).
Removing the select(2) from the condition (that is, always calling
CHECK_FOR_INTERRUPTS() when ProcDiePending is true) prevents that
possibility, in exchange for killing 'really live' connections, but
IMHO that is no problem on intentional server termination.
A more reliable measure for this would be non-blocking I/O, but that
requires more invasive changes to the code.
==== Reproducing the situation.
Another possible question would be about the likelihood of such
blocking, or how to reproduce the situation. I found that send(2)
on CentOS 6.5 somehow succeeds, in most cases, in sending the
remaining data on the retry after being interrupted by a signal
while blocked on a full buffer, despite no change in the
environment. So reproducing the stuck situation is rather difficult
on the server as is. But it can be reproduced quite easily with a
little cheating, namely enlarging the PQ send buffer, say tenfold.
With the attached testing patch applied (socket_block_test.patch),
the following steps will reproduce the stuck situation.
1. Run a SELECT that returns a big result and press Ctrl-Z just
after invoking it.
cl> $ psql -h localhost postgres
cl> postgres=# select 1 from generate_series(0, 9999999);
cl> ^Z
cl> [4]+ Stopped psql -h localhost postgres
2. Watch the server get stuck.
The server starts to print lines like following to console
after a while, then stops. The number enclosed by the square
brackets is PID of the server.
sv> #### [8809] (bare) rest = 0 / 81920 bytes, ProcDiePending = 0
3. Do pg_terminate_backend().
cl> $ psql postgres -c "select pg_terminate_backend(8809)"
The server will get stuck as follows; PID 8811 is another
session created by the command just above.
sv> #### [8809] (bare) rest = 0 / 81920 bytes, ProcDiePending = 0
sv> #### [8811] (bare) rest = 0 / 327 bytes, ProcDiePending = 0
sv> #### [8809] (bare) rest = 500 / 81920 bytes, ProcDiePending = 1
sv> #### [8811] (bare) rest = 0 / 78 bytes, ProcDiePending = 0
The server 8809 is blocked while sending the remaining 500
bytes and will never come back, short of SIGKILL or the
client reading data again (fg does that).
cl> $ fg
sv> #### [8809] (bare) rest = 0 / 500 bytes, ProcDiePending = 1
sv> FATAL: terminating connection due to administrator command
sv> STATEMENT: select 1 from generate_series(0, 9999999);
sv> #### [8809] (bare) rest = 0 / 116 bytes, ProcDiePending = 0
sv> #### [8883] (bare) rest = 0 / 327 bytes, ProcDiePending = 0
If the situation does not occur, changing the value in the
select clause (by its length, not by its value :) may help, as
may entering Ctrl-Z after some of the debug output has
appeared.
For SSL connections, the debug output looks like the following,
sv> #### [20064] (bare) rest = 0 / 81920 bytes, ProcDiePending = 0
sv> #### [20064] (SSL) rest = 0 / 16413 bytes, ProcDiePending = 0
sv> #### [20064] (SSL) rest = 0 / 16413 bytes, ProcDiePending = 0
sv> #### [20064] (SSL) rest = 0 / 16413 bytes, ProcDiePending = 0
sv> #### [20064] (SSL) rest = 980 / 16413 bytes, ProcDiePending = 1
sv> #### [20064] (SSL) rest = 0 / 980 bytes, ProcDiePending = 1
sv> #### [20064] (SSL) rest = 1029 / 16413 bytes, ProcDiePending = 1
The "bare" label here actually means the status of SSL_write, and
"SSL" means the status of the underlying bare socket of the SSL
connection. (Sorry for the confusing labels..)
The (bare) line above does not correspond to the bunch of (SSL)
lines that follow it, but to their precedents. At the fifth line,
send(2) is interrupted by the signal issued by
pg_terminate_backend(), then (somehow) retries successfully at the
sixth line, but the SSL layer handed over another 16413 bytes, of
which only 1029 bytes were sent on the first try, and the server got
stuck on the second try at the seventh line. Control does not come
back to secure_write() during this sequence.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Mon, Jun 30, 2014 at 4:13 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello, I have received inquiries related to blocked communication
several times in recent weeks, with different symptoms. Then I
found this message in the archive:

Subject: Escaping a blocked sendto() syscall without causing a restart
Mr. Tom Lane gave a comment in reply to it:
Offhand it looks to me like most signals would kick the backend off the
send() call ... but it would loop right back and try again. See
internal_flush() in pqcomm.c. (If you're using SSL, this diagnosis
may or may not apply.)

We can't do anything except repeat the send attempt if the client
connection is to be kept in a sane state.

(snipped)
And I'm not at all sure if we could get it to work in SSL mode...
That's true for timeouts that should continue the connection,
say, statement_timeout, but focusing on intentional backend
termination, I think it does no harm to break it up abruptly,
even if it was on SSL. On the other hand it seems still
preferable to keep a connection when not blocked. The following
expression would detect such a blocking state just before the
next send(2) after the previous try was interrupted by a signal.

(ProcDiePending && select(1, NULL, fd, NULL, '1 sec') == 0)
Finally, pg_terminate_backend() works even when send is blocked
for both SSL and non-SSL connections after 1 second delay with
this patch (break_socket_blocking_on_termination_v1.patch).

Nevertheless, of course statement_timeout cannot become effective
by this method since it breaks the consistency in the client
protocol. It needs change in client protocol to have "out of
band" mechanism or something, maybe.

Any suggestions?
You should probably add your patch here, so it doesn't get forgotten about:
https://commitfest.postgresql.org/action/commitfest_view/open
We're focused on reviewing patches for the current CommitFest, so your
patch might not get attention right away. A couple of general
thoughts on this topic:
1. I think it's the case that there are platforms around where a
signal won't cause send() to return EINTR.... and I'd be entirely
unsurprised if SSL_write() doesn't necessarily return EINTR in that
case. I'm not sure what, if anything, we can do about that.
2. I think it would be reasonable to try to kill off the connection
without notifying the client if we're unable to send the data to the
client in a reasonable period of time. But I'm unsure what "a
reasonable period of time" means. This patch would basically do it
after no delay at all, which seems like it might be too aggressive.
However, I'm not sure.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello. The replies below are mainly a memo for myself, so
please don't feel obliged to answer until the time comes.
At Mon, 30 Jun 2014 11:27:47 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmoZfcGzAEmtbyoCe6VdHnq085x+ox752zuJ2AKN=Wc8PnQ@mail.gmail.com>
You should probably add your patch here, so it doesn't get forgotten about:
https://commitfest.postgresql.org/action/commitfest_view/open
Ok, I'll put this there.
We're focused on reviewing patches for the current CommitFest, so your
patch might not get attention right away. A couple of general
thoughts on this topic:
Thank you for suggestions. I'll consider on them.
1. I think it's the case that there are platforms around where a
signal won't cause send() to return EINTR.... and I'd be entirely
unsurprised if SSL_write() doesn't necessarily return EINTR in that
case. I'm not sure what, if anything, we can do about that.
man 2 send on FreeBSD has no description of EINTR. And even
on Linux, send() won't return EINTR in most cases; at least I
haven't seen it. So send() = -1 with EINTR seems to me merely an
equivalent of send() = 0. I have no idea what difference the
implementer had in mind.
2. I think it would be reasonable to try to kill off the connection
without notifying the client if we're unable to send the data to the
client in a reasonable period of time. But I'm unsure what "a
reasonable period of time" means. This patch would basically do it
after no delay at all, which seems like it might be too aggressive.
However, I'm not sure.
I think there is no such reasonable time; the behavior should
probably be determined from another viewpoint. One alternative would
be to let pg_terminate_backend() take a parameter requesting a
forced shutdown (how to propagate it?), or to force a shutdown on a
duplicate invocation of pg_terminate_backend().
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On Mon, Jun 30, 2014 at 11:26 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
2. I think it would be reasonable to try to kill off the connection
without notifying the client if we're unable to send the data to the
client in a reasonable period of time. But I'm unsure what "a
reasonable period of time" means. This patch would basically do it
after no delay at all, which seems like it might be too aggressive.
However, I'm not sure.

I think there is no such reasonable time; the behavior should
probably be determined from another viewpoint. One alternative would
be to let pg_terminate_backend() take a parameter requesting a
forced shutdown (how to propagate it?), or to force a shutdown on a
duplicate invocation of pg_terminate_backend().
Well, I think that when people call pg_terminate_backend() just once,
they expect it to kill the target backend. I think people will
tolerate a short delay, like a few seconds; after all, there's no
guarantee, even today, that the backend will hit a
CHECK_FOR_INTERRUPTS() in less than a few hundred milliseconds. But
they are not going to want to have to take a second action to kill the
backend - killing it once should be sufficient.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Jul 01, 2014 at 12:26:43PM +0900, Kyotaro HORIGUCHI wrote:
1. I think it's the case that there are platforms around where a
signal won't cause send() to return EINTR.... and I'd be entirely
unsurprised if SSL_write() doesn't necessarily return EINTR in that
case. I'm not sure what, if anything, we can do about that.

man 2 send on FreeBSD has no description of EINTR. And even
on Linux, send() won't return EINTR in most cases; at least I
haven't seen it. So send() = -1 with EINTR seems to me merely an
equivalent of send() = 0. I have no idea what difference the
implementer had in mind.
Whether send() returns EINTR or not depends on whether the signal has
been marked restartable or not. This is configurable per signal, see
sigaction(). If the signal is marked to restart, the kernel returns
ERESTARTHAND (IIRC) and the libc will redo the call internally.
Default BSD does not return EINTR normally, but supports sigaction().
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
He who writes carelessly confesses thereby at the very outset that he does
not attach much importance to his own thoughts.
-- Arthur Schopenhauer
Hello,
At Tue, 1 Jul 2014 21:21:38 +0200, Martijn van Oosterhout <kleptog@svana.org> wrote in <20140701192138.GB20140@svana.org>
On Tue, Jul 01, 2014 at 12:26:43PM +0900, Kyotaro HORIGUCHI wrote:
1. I think it's the case that there are platforms around where a
signal won't cause send() to return EINTR.... and I'd be entirely
unsurprised if SSL_write() doesn't necessarily return EINTR in that
case. I'm not sure what, if anything, we can do about that.

man 2 send on FreeBSD has no description of EINTR. And even
on Linux, send() won't return EINTR in most cases; at least I
haven't seen it. So send() = -1 with EINTR seems to me merely an
equivalent of send() = 0. I have no idea what difference the
implementer had in mind.

Whether send() returns EINTR or not depends on whether the signal has
been marked restartable or not. This is configurable per signal, see
sigaction(). If the signal is marked to restart, the kernel returns
ERESTARTHAND (IIRC) and the libc will redo the call internally.
Wow, thank you for detailed information. I'll study that and take
it into future discussion.
Default BSD does not return EINTR normally, but supports sigaction().
I guess that is for ease of keeping compatibility, which seems
reasonable.
have a nice day,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello, thank you for keeping this discussion moving.
I think there is no such reasonable time; the behavior should
probably be determined from another viewpoint. One alternative would
be to let pg_terminate_backend() take a parameter requesting a
forced shutdown (how to propagate it?), or to force a shutdown on a
duplicate invocation of pg_terminate_backend().

Well, I think that when people call pg_terminate_backend() just once,
they expect it to kill the target backend. I think people will
tolerate a short delay, like a few seconds; after all, there's no
guarantee, even today, that the backend will hit a
CHECK_FOR_INTERRUPTS() in less than a few hundred milliseconds.
Sure.
But they are not going to want to have to take a second action
to kill the backend - killing it once should be sufficient.
Hmm, that sounds persuasive. Well, do you think they would tolerate
a -force option? (Even though its technical practicality is not
clear.)
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 07/01/2014 06:26 AM, Kyotaro HORIGUCHI wrote:
At Mon, 30 Jun 2014 11:27:47 -0400, Robert Haas <robertmhaas@gmail.com> wrote in <CA+TgmoZfcGzAEmtbyoCe6VdHnq085x+ox752zuJ2AKN=Wc8PnQ@mail.gmail.com>
1. I think it's the case that there are platforms around where a
signal won't cause send() to return EINTR.... and I'd be entirely
unsurprised if SSL_write() doesn't necessarily return EINTR in that
case. I'm not sure what, if anything, we can do about that.
We use a custom "write" routine with SSL_write, where we call send()
ourselves, so that's not a problem as long as we put the check in the
right place (in secure_raw_write(), after my recent SSL refactoring -
the patch needs to be rebased).
man 2 send on FreeBSD has no description of EINTR. And even
on Linux, send() won't return EINTR in most cases; at least I
haven't seen it. So send() = -1 with EINTR seems to me merely an
equivalent of send() = 0. I have no idea what difference the
implementer had in mind.
As the patch stands, there's a race condition: if the SIGTERM arrives
*before* the send() call, the send() won't return EINTR anyway. So
there's a chance that you still block. Calling pg_terminate_backend()
again will dislodge it (assuming send() returns with EINTR on signal),
but I don't think we want to define the behavior as "usually,
pg_terminate_backend() will kill a backend that's blocked on sending to
the client, but sometimes you have to call it twice (or more!) to really
kill it".
A more robust way is to set ImmediateInterruptOK before calling send().
That wouldn't let you send data that can be sent without blocking
though. For that, you could put the socket to non-blocking mode, and
sleep with select(), also waiting for the process' latch at the same
time (die() sets the latch, so that will wake up the select() if a
termination request arrives).
Is it actually safe to process the die-interrupt where send() is called?
ProcessInterrupts() does "ereport(FATAL, ...)", which will attempt to
send a message to the client. If that happens in the middle of
constructing some other message, that will violate the protocol.
2. I think it would be reasonable to try to kill off the connection
without notifying the client if we're unable to send the data to the
client in a reasonable period of time. But I'm unsure what "a
reasonable period of time" means. This patch would basically do it
after no delay at all, which seems like it might be too aggressive.
However, I'm not sure.

I think there is no such reasonable time.
I agree it's pretty hard to define any reasonable timeout here. I think
it would be fine to just cut the connection; even if you don't block
while sending, you'll probably reach a CHECK_FOR_INTERRUPT() somewhere
higher in the stack and kill the connection almost as abruptly anyway.
(you can't violate the protocol, however)
- Heikki
Sorry, I was absorbed in other tasks.
Thank you for reviewing this.
On 07/01/2014 06:26 AM, Kyotaro HORIGUCHI wrote:
At Mon, 30 Jun 2014 11:27:47 -0400, Robert Haas
<robertmhaas@gmail.com> wrote in
<CA+TgmoZfcGzAEmtbyoCe6VdHnq085x+ox752zuJ2AKN=Wc8PnQ@mail.gmail.com>

1. I think it's the case that there are platforms around where a
signal won't cause send() to return EINTR.... and I'd be entirely
unsurprised if SSL_write() doesn't necessarily return EINTR in that
case. I'm not sure what, if anything, we can do about that.

We use a custom "write" routine with SSL_write, where we call send()
ourselves, so that's not a problem as long as we put the check in the
right place (in secure_raw_write(), after my recent SSL refactoring -
the patch needs to be rebased).

man 2 send on FreeBSD has no description of EINTR. And even
on Linux, send() won't return EINTR in most cases; at least I
haven't seen it. So send() = -1 with EINTR seems to me merely an
equivalent of send() = 0. I have no idea what difference the
implementer had in mind.

As the patch stands, there's a race condition: if the SIGTERM arrives
*before* the send() call, the send() won't return EINTR anyway. So
there's a chance that you still block. Calling pq_terminate_backend()
again will dislodge it (assuming send() returns with EINTR on signal),
Yes, that window wouldn't be eliminated without introducing
something more. EINTR is set only when nothing was sent by the
call, so as far as I see the chance of getting EINTR is far smaller
than one might expect.
but I don't think we want to define the behavior as "usually,
pq_terminate_backend() will kill a backend that's blocked on sending
to the client, but sometimes you have to call it twice (or more!) to
really kill it".
I agree that that is the desirable behavior, if there is any
measure to achieve it. But I think even this is better than doing
kill -9 and engulfing all the innocent backends.
A more robust way is to set ImmediateInterruptOK before calling
send(). That wouldn't let you send data that can be sent without
blocking though. For that, you could put the socket to non-blocking
mode, and sleep with select(), also waiting for the process' latch at
the same time (die() sets the latch, so that will wake up the select()
if a termination request arrives).
I considered it, but select() frequently (in most cases, in fact,
when send() blocks because of send-buffer exhaustion) fails to
predict that the following send() will block (if my memory is
correct). So the final problem would still be the blocked send()...
Is it actually safe to process the die-interrupt where send() is
called? ProcessInterrupts() does "ereport(FATAL, ...)", which will
attempt to send a message to the client. If that happens in the middle
of constructing some other message, that will violate the protocol.
So I strongly agree with you, if select() works the way the man
page gives the impression it does.
2. I think it would be reasonable to try to kill off the connection
without notifying the client if we're unable to send the data to the
client in a reasonable period of time. But I'm unsure what "a
reasonable period of time" means. This patch would basically do it
after no delay at all, which seems like it might be too aggressive.
However, I'm not sure.I think there's no such a reasonable time.
I agree it's pretty hard to define any reasonable timeout here. I
think it would be fine to just cut the connection; even if you don't
block while sending, you'll probably reach a CHECK_FOR_INTERRUPT()
somewhere higher in the stack and kill the connection almost as
abruptly anyway. (you can't violate the protocol, however)
Yes, closing the blocked connection seems one of the smartest
ways; checking which interrupt occurred could avoid the protocol
violation. But the problem there is that there seems to be no means
to close the socket from anywhere but the blocking handle; a
dup(2)'ed handle cannot release the resource by itself.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 08/26/2014 09:17 AM, Kyotaro HORIGUCHI wrote:
but I don't think we want to define the behavior as "usually,
pg_terminate_backend() will kill a backend that's blocked on sending
to the client, but sometimes you have to call it twice (or more!) to
really kill it".

I agree that that is the desirable behavior, if there is any
measure to achieve it. But I think even this is better than doing
kill -9 and engulfing all the innocent backends.

A more robust way is to set ImmediateInterruptOK before calling
send(). That wouldn't let you send data that can be sent without
blocking though. For that, you could put the socket to non-blocking
mode, and sleep with select(), also waiting for the process' latch at
the same time (die() sets the latch, so that will wake up the select()
if a termination request arrives).

I considered it, but select() frequently (in most cases, in fact,
when send() blocks because of send-buffer exhaustion) fails to
predict that the following send() will block (if my memory is
correct). So the final problem would still be the blocked send()...
My point was to put the socket in non-blocking mode, so that send()
returns immediately with EAGAIN instead of blocking if the send
buffer is full. See WalSndWriteData for how that would work; it does
something similar.
Is it actually safe to process the die-interrupt where send() is
called? ProcessInterrupts() does "ereport(FATAL, ...)", which will
attempt to send a message to the client. If that happens in the middle
of constructing some other message, that will violate the protocol.

So I strongly agree with you, if select() works the way the man
page gives the impression it does.
Not sure what you mean, but the above is a fatal problem with the patch
right now, regardless of how you do the sleeping.
2. I think it would be reasonable to try to kill off the connection
without notifying the client if we're unable to send the data to the
client in a reasonable period of time. But I'm unsure what "a
reasonable period of time" means. This patch would basically do it
after no delay at all, which seems like it might be too aggressive.
However, I'm not sure.

I think there is no such reasonable time.
I agree it's pretty hard to define any reasonable timeout here. I
think it would be fine to just cut the connection; even if you don't
block while sending, you'll probably reach a CHECK_FOR_INTERRUPT()
somewhere higher in the stack and kill the connection almost as
abruptly anyway. (you can't violate the protocol, however)

Yes, closing the blocked connection seems one of the smartest
ways; checking which interrupt occurred could avoid the protocol
violation. But the problem there is that there seems to be no means
to close the socket from anywhere but the blocking handle; a
dup(2)'ed handle cannot release the resource by itself.
I didn't understand that, surely you can just close() the socket? There
is no dup(2) involved. And we don't necessarily need to close the
socket, we just need to avoid writing to it when we're already in the
middle of sending a message.
I'm marking this as Waiting on Author in the commitfest app, because:
1. the protocol violation needs to be avoided one way or another, and
2. the behavior needs to be consistent so that a single
pg_terminate_backend() is enough to always kill the connection.
- Heikki
Hello,
I considered it, but select() frequently (in most cases, in fact,
when send() blocks because of send-buffer exhaustion) fails to
predict that the following send() will block (if my memory is
correct). So the final problem would still be the blocked send()...

My point was to put the socket in non-blocking mode, so that send()
will return immediately with EAGAIN instead of blocking, if the send
buffer is full. See WalSndWriteData for how that would work; it does
something similar.
I confused it with what I did while writing this patch: select()
followed by a blocking send(). Sorry for confusing the discussion. I
now understand correctly what you mean, and it sounds reasonable.
I agree it's pretty hard to define any reasonable timeout here. I
think it would be fine to just cut the connection; even if you don't
block while sending, you'll probably reach a CHECK_FOR_INTERRUPT()
somewhere higher in the stack and kill the connection almost as
abruptly anyway. (you can't violate the protocol, however)

Yes, closing the blocked connection seems one of the smartest
ways; checking which interrupt occurred could avoid the protocol
violation. But the problem there is that there seems to be no means
to close the socket from anywhere but the blocking handle; a
dup(2)'ed handle cannot release the resource by itself.

I didn't understand that, surely you can just close() the socket?
There is no dup(2) involved. And we don't necessarily need to close
the socket, we just need to avoid writing to it when we're already in
the middle of sending a message.
My assumption there was select() plus a *blocking* send(), so the
cited sentence was beside the point from the start.
I'm marking this as Waiting on Author in the commitfest app, because:
1. the protocol violation needs to be avoided one way or another, and
2. the behavior needs to be consistent so that a single
pg_terminate_backend() is enough to always kill the connection.
Thank you for the suggestion. I think I can go forward with that
and will come up with a new patch.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello, sorry for the muddled reply in my previous mail.
I have made a revised patch for this issue.
The attached patches are as follows:
- 0001_Revise_socket_emulation_for_win32_backend.patch
Revises socket emulation on win32 backend so that each socket
can have its own blocking mode state.
- 0002_Allow_backend_termination_during_write_blocking.patch
The patch to solve the issue. This patch depends on the 0001_
patch.
==========
I'm marking this as Waiting on Author in the commitfest app, because:
1. the protocol violation needs to be avoided one way or another, and
2. the behavior needs to be consistent so that a single
pg_terminate_backend() is enough to always kill the connection.
- Preventing protocol violation.
To prevent a protocol violation, secure_write() sets
ClientConnectionLost when a SIGTERM is detected, and
internal_flush() and ProcessInterrupts() then act on that.
- A single pg_terminate_backend() reliably kills the backend.
secure_raw_write() uses a non-blocking socket and a loop of
select() with a timeout, so that a received signal (SIGTERM) is
reliably detected.
To avoid frequent switching of the blocking mode, the bare socket
for the Port is put into non-blocking mode from the start in
StreamConnection(), and blocking behavior is controlled only by
Port->noblock in secure_raw_read/write().
To keep the code mentioned above (patch 0002) tidy, patch 0001
rewrites the socket emulation code for win32 backends so that each
socket can have its own non-blocking state.
Some concerns about this patch:
- This patch limits the number of non-blocking sockets to 64
(FD_SETSIZE) on the win32 backend, but that seems sufficient.
- This patch introduces redundant socket emulation for the win32
backend, but since the win32 bare socket for the Port is already
non-blocking as described, it doesn't seem to be a serious
performance problem. In addition, since I don't know the reason
why win32/socket.c provides the blocking-mode socket emulation, I
decided to let win32/socket.c keep the blocking socket emulation.
Possibly it can be removed.
Any suggestions?
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote:
To keep the code mentioned above (patch 0002) tidy, patch 0001
rewrites the socket emulation code for win32 backends so that
each socket can have its own non-blocking state.
The first patch that makes non-blocking sockets behave more sanely on
Windows seems like a good idea, independently of the second patch. I'm
looking at the first patch now, I'll make a separate post about the
second patch.
Some concerns about this patch:
- This patch limits the number of non-blocking sockets to 64
(FD_SETSIZE) on win32 backends, but that seems sufficient.
Yeah, that's plenty.
- This patch introduces redundant socket emulation for the win32
backend, but since the bare win32 socket for Port is already
non-blocking as described, it doesn't seem to be a serious
performance problem. In addition, since I don't know why
win32/socket.c provides the blocking-mode socket emulation, I
decided to preserve the blocking-socket emulation in
win32/socket.c. Possibly it can be removed.
On Windows, the backend has an emulation layer for POSIX signals, which
uses threads and Windows events. The reason win32/socket.c always uses
non-blocking mode internally is that it needs to wait for the socket to
become readable/writeable, and for the signal-emulation event, at the
same time. So no, we can't remove it.
The approach taken in the first patch seems sensible. I changed it to
not use FD_SET, though. A custom array seems better: that way we don't
need the pgwin32_nonblockset_init() call, we can just statically
initialize the variable. It's a little bit more code, but it's well
contained in win32/socket.c. Please take a look, to double-check that I
didn't screw up.
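The custom-array idea can be illustrated roughly like this (names are
hypothetical; the real code lives in win32/socket.c). A zero-initialized
static array needs no explicit init call:

```c
#include <stdbool.h>

/* Track which sockets are in non-blocking mode, without FD_SET and
 * without an init function: the static array starts out empty. */
#define MAX_NONBLOCK_SOCKETS 64

typedef int SOCKET_T;			/* stands in for Windows' SOCKET */

static SOCKET_T nonblock_socks[MAX_NONBLOCK_SOCKETS];
static int	num_nonblock_socks = 0;

static void
set_socket_nonblocking(SOCKET_T s, bool nonblock)
{
	int			i;

	for (i = 0; i < num_nonblock_socks; i++)
	{
		if (nonblock_socks[i] == s)
		{
			if (!nonblock)
			{
				/* remove by replacing with the last entry */
				nonblock_socks[i] = nonblock_socks[--num_nonblock_socks];
			}
			return;
		}
	}
	if (nonblock && num_nonblock_socks < MAX_NONBLOCK_SOCKETS)
		nonblock_socks[num_nonblock_socks++] = s;
}

static bool
socket_is_nonblocking(SOCKET_T s)
{
	int			i;

	for (i = 0; i < num_nonblock_socks; i++)
		if (nonblock_socks[i] == s)
			return true;
	return false;
}
```

A linear scan over at most 64 entries is cheap, and the whole state can
live in one translation unit.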
- Heikki
Attachments:
improve-nonblocking-sockets-on-windows-1.patch (text/x-diff, +97/-37)
On 08/28/2014 03:47 PM, Kyotaro HORIGUCHI wrote:
- Preventing protocol violation.
To prevent a protocol violation, secure_write() sets
ClientConnectionLost when SIGTERM is detected; internal_flush()
and ProcessInterrupts() then act on that flag.
Oh, hang on. Now that I look at pqcomm.c more closely, it already has a
mechanism to avoid writing a message in the middle of another message.
See pq_putmessage and PqCommBusy. Can we rely on that?
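For readers unfamiliar with that mechanism, the idea behind a busy flag
like PqCommBusy can be sketched as below. This is a simplified
illustration, not the actual pqcomm.c code; `send_bytes`,
`putmessage_guarded`, and the output buffer are stand-ins:

```c
#include <stdbool.h>
#include <string.h>

/* While PqCommBusy is set we are in the middle of emitting a protocol
 * message, so interrupt processing can tell that aborting right now
 * would leave a partial message on the wire. */
static bool PqCommBusy = false;

static char outbuf[1024];
static int	outlen = 0;

/* Stand-in for the real transport call. */
static int
send_bytes(const char *data, int len)
{
	if (outlen + len > (int) sizeof(outbuf))
		return -1;
	memcpy(outbuf + outlen, data, len);
	outlen += len;
	return 0;
}

static int
putmessage_guarded(char msgtype, const char *body, int len)
{
	if (PqCommBusy)
		return -1;				/* refuse to interleave messages */
	PqCommBusy = true;
	if (send_bytes(&msgtype, 1) < 0 || send_bytes(body, len) < 0)
	{
		PqCommBusy = false;
		return -1;				/* caller must treat connection as broken */
	}
	PqCommBusy = false;
	return 0;
}
```

The flag only guards message boundaries; it does not by itself unblock a
send() stuck in the kernel, which is why the two mechanisms are discussed
together here.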
- A single pg_terminate_backend() reliably kills the backend.
secure_raw_write() uses a non-blocking socket and a select()
loop with a timeout so that a received SIGTERM is detected
reliably.
I was going to suggest using WaitLatchOrSocket instead of sleeping in
1-second increments, but I see that WaitLatchOrSocket() doesn't currently
support waiting for a socket to become writeable, without also waiting
for it to become readable. I wonder how difficult it would be to lift
that restriction.
I also wonder if it would be simpler to keep the socket in blocking mode
after all, and just close() in the signal handler if PqCommBusy == true.
If the signal arrives while sleeping in send(), the effect would be the
same as with your patch. If the signal arrives while sending, but not
sleeping, you would not get the chance to send the already-buffered data
to the client. But maybe that's OK, whether or not you're blocked is not
very deterministic anyway.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2014-09-02 21:46:29 +0300, Heikki Linnakangas wrote:
I was going to suggest using WaitLatchOrSocket instead of sleeping in
1-second increments, but I see that WaitLatchOrSocket() doesn't currently
support waiting for a socket to become writeable, without also waiting for
it to become readable. I wonder how difficult it would be to lift that
restriction.
It's IMO not that difficult. I have a prototype patch for that
somewhere. I tested my poll() implementation and it worked, but I
didn't get to select() yet.
I also wonder if it would be simpler to keep the socket in blocking mode
after all, and just close() in the signal handler if PqCommBusy == true. If
the signal arrives while sleeping in send(), the effect would be the same as
with your patch. If the signal arrives while sending, but not sleeping, you
would not get the chance to send the already-buffered data to the client.
But maybe that's OK, whether or not you're blocked is not very deterministic
anyway.
I've actually been working on a patch to make the whole interaction with
the client use non-blocking sockets. The reason I started is that we do
lots of far too complex stuff in signal handlers, and using a latch would
allow us to interrupt send/recv instead. While still heavily WIP, the
reduction in odd stuff (check e.g. HandleCatchupInterrupt()) made me
rather happy.
I'm slightly worried about the added overhead due to the latch code. In
my implementation I only use latches after a nonblocking read, but
still. Every WaitLatchOrSocket() does a drainSelfPipe(). I wonder if
that can be made problematic.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2014-09-02 21:01:44 +0200, Andres Freund wrote:
I've actually been working on a patch to make the whole interaction with
the client use non-blocking sockets. The reason I started is that we do
lots of far too complex stuff in signal handlers, and using a latch would
allow us to interrupt send/recv instead. While still heavily WIP, the
reduction in odd stuff (check e.g. HandleCatchupInterrupt()) made me
rather happy.
Actually, an even more important reason is that it would allow us to
throw ERRORs/FATALs sanely from interrupts, because we wouldn't
potentially jump through OpenSSL anymore...
Greetings,
Andres Freund
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
I was going to suggest using WaitLatchOrSocket instead of sleeping in
1-second increments, but I see that WaitLatchOrSocket() doesn't currently
support waiting for a socket to become writeable, without also waiting
for it to become readable. I wonder how difficult it would be to lift
that restriction.
My recollection is that there was a reason for that, but I don't recall
details any more.
regards, tom lane
On 2014-09-02 17:21:03 -0400, Tom Lane wrote:
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
I was going to suggest using WaitLatchOrSocket instead of sleeping in
1-second increments, but I see that WaitLatchOrSocket() doesn't currently
support waiting for a socket to become writeable, without also waiting
for it to become readable. I wonder how difficult it would be to lift
that restriction.
My recollection is that there was a reason for that, but I don't recall
details any more.
http://git.postgresql.org/pg/commitdiff/e42a21b9e6c9b9e6346a34b62628d48ff2fc6ddf
In my prototype I've changed the API so that errors set both
READABLE and WRITABLE. Seems to work...
Greetings,
Andres Freund
On 09/03/2014 12:23 AM, Andres Freund wrote:
On 2014-09-02 17:21:03 -0400, Tom Lane wrote:
Heikki Linnakangas <hlinnakangas@vmware.com> writes:
I was going to suggest using WaitLatchOrSocket instead of sleeping in
1-second increments, but I see that WaitLatchOrSocket() doesn't currently
support waiting for a socket to become writeable, without also waiting
for it to become readable. I wonder how difficult it would be to lift
that restriction.
My recollection is that there was a reason for that, but I don't recall
details any more.
http://git.postgresql.org/pg/commitdiff/e42a21b9e6c9b9e6346a34b62628d48ff2fc6ddf
In my prototype I've changed the API so that errors set both
READABLE and WRITABLE. Seems to work...
Andres, would you mind posting the WIP patch you have? That could be a
better foundation for this patch.
- Heikki
On Tue, Sep 2, 2014 at 3:01 PM, Andres Freund <andres@2ndquadrant.com> wrote:
I'm slightly worried about the added overhead due to the latch code. In
my implementation I only use latches after a nonblocking read, but
still. Every WaitLatchOrSocket() does a drainSelfPipe(). I wonder if
that can be made problematic.
I think that's not the word you're looking for. Or if it is, then -
it's already problematic. At some point I hacked up a very crude
prototype that made LWLocks use latches to sleep instead of
semaphores. It was slow.
AIUI, the only reason why we need the self-pipe thing is because on
some platforms signals don't interrupt system calls. But my
impression was that those platforms were somewhat obscure. Could we
have a separate latch implementation for platforms where we know that
system calls will get interrupted by signals? Alternatively, should
we consider reimplementing latches using semaphores? I assume having
the signal handler up the semaphore would allow the attempt to down
the semaphore to succeed on return from the handler, so it would
accomplish the same thing as the self-pipe trick.
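For reference, the self-pipe trick mentioned here can be sketched in a
few lines: the signal handler writes one byte into a pipe, turning a
signal into a file-descriptor event that select()/poll() can wait on
alongside sockets. This is a minimal standalone illustration, not
PostgreSQL's latch code:

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/select.h>

static int	selfpipe[2];

static void
selfpipe_init(void)
{
	(void) pipe(selfpipe);
	(void) fcntl(selfpipe[0], F_SETFL, O_NONBLOCK);
	(void) fcntl(selfpipe[1], F_SETFL, O_NONBLOCK);
}

/* Called from a signal handler: write() is async-signal-safe. */
static void
selfpipe_set(void)
{
	(void) write(selfpipe[1], "x", 1);
}

/* Drain all pending wakeup bytes before sleeping again -- this is the
 * drainSelfPipe() step whose per-wait cost is being discussed above. */
static void
selfpipe_drain(void)
{
	char		buf[16];

	while (read(selfpipe[0], buf, sizeof(buf)) > 0)
		;
}

/* Returns nonzero once the read end becomes readable (latch was set). */
static int
selfpipe_wait(void)
{
	fd_set		rfds;

	FD_ZERO(&rfds);
	FD_SET(selfpipe[0], &rfds);
	return select(selfpipe[0] + 1, &rfds, NULL, NULL, NULL) > 0;
}
```

A semaphore-based latch would avoid the pipe and the drain step, at the
cost of not being able to multiplex the wakeup with socket readiness in
one select() call, which is exactly the trade-off under discussion.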
Basically, it doesn't feel like a good thing that we've got two sets
of primitives for making a backend wait that (1) don't really know
about each other and (2) use different operating system primitives.
Presumably one of the two systems is better; let's figure out which
one it is, use that one all the time, and get rid of the other one.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company