Windows buildfarm members vs. new async-notify isolation test
I see from the buildfarm status page that since commits 6b802cfc7
et al went in a week ago, frogmouth and currawong have failed that
new test case every time, with the symptom
================== pgsql.build/src/test/isolation/regression.diffs ===================
*** c:/prog/bf/root/REL_10_STABLE/pgsql.build/src/test/isolation/expected/async-notify.out Mon Nov 25 00:30:49 2019
--- c:/prog/bf/root/REL_10_STABLE/pgsql.build/src/test/isolation/results/async-notify.out Mon Dec 2 00:54:26 2019
***************
*** 93,99 ****
step llisten: LISTEN c1; LISTEN c2;
step lcommit: COMMIT;
step l2commit: COMMIT;
- listener2: NOTIFY "c1" with payload "" from notifier
step l2stop: UNLISTEN *;
starting permutation: llisten lbegin usage bignotify usage
--- 93,98 ----
(Note that these two critters don't run branches v11 and up, which
is why they're only showing this failure in 10 and 9.6.)
drongo showed the same failure once in v10, and fairywren showed
it once in v12. Every other buildfarm animal seems happy.
I'm a little baffled as to what this might be --- some sort of
timing problem in our Windows signal emulation, perhaps? But
if so, why haven't we found it years ago?
I don't have any ability to test this myself, so would appreciate
help or ideas.
regards, tom lane
On 12/2/19 11:23 AM, Tom Lane wrote:
I'm a little baffled as to what this might be --- some sort of
timing problem in our Windows signal emulation, perhaps? But
if so, why haven't we found it years ago?
I don't have any ability to test this myself, so would appreciate
help or ideas.
I can test things, but I don't really know what to test. FYI frogmouth
and currawong run on virtualized XP. drongo and fairywren run on
virtualized WS2019. Neither VM is heavily resourced.
Note that my other windows critters jacana and bowerbird which run on
bare metal W10Pro haven't seen this problem.
(BTW, from next week I'm going to be down under for 3 months, and my
ability to test Windows things will be somewhat reduced.)
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 12/2/19 11:42 AM, Andrew Dunstan wrote:
On 12/2/19 11:23 AM, Tom Lane wrote:
I'm a little baffled as to what this might be --- some sort of
timing problem in our Windows signal emulation, perhaps? But
if so, why haven't we found it years ago? I don't have any
ability to test this myself, so would appreciate help or ideas.
I can test things, but I don't really know what to test. FYI frogmouth
and currawong run on virtualized XP. drongo and fairywren run on
virtualized WS2019. Neither VM is heavily resourced.
Hi Andrew, if you have time you could perhaps check the
isolation test structure itself. Like Tom, I don't have a
Windows box to test this.
I would be curious to see if there is a race condition in
src/test/isolation/isolationtester.c between the loop starting
on line 820:
while ((res = PQgetResult(conn)))
{
...
}
and the attempt to consume input that might include NOTIFY
messages on line 861:
PQconsumeInput(conn);
If the first loop consumes the commit message, gets no
further PGresult from PQgetResult, and finishes, and execution
proceeds to PQconsumeInput before the NOTIFY has arrived
over the socket, there won't be anything for PQnotifies to
return, and hence for try_complete_step to print before
returning.
I'm not sure if it is possible for the commit message to
arrive before the notify message in the fashion I am describing,
but that's something you might easily check by having
isolationtester sleep before PQconsumeInput on line 861.
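Concretely, the experiment I have in mind would be something like this
just before the PQconsumeInput call in try_complete_step() (purely a
diagnostic sketch; pg_usleep and the 100ms figure are arbitrary choices
on my part):

```diff
+	/* Diagnostic only: give an in-flight NOTIFY time to arrive over
+	 * the socket before we drain it and check PQnotifies(). */
+	pg_usleep(100 * 1000);		/* 100ms, arbitrary */
 	PQconsumeInput(conn);
```

If the missing NOTIFY shows up with the sleep in place, that would
suggest the message is arriving over the socket only after the commit's
response has been fully consumed.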
--
Mark Dilger
Mark Dilger <hornschnorter@gmail.com> writes:
On 12/2/19 11:42 AM, Andrew Dunstan wrote:
On 12/2/19 11:23 AM, Tom Lane wrote:
I'm a little baffled as to what this might be --- some sort of
timing problem in our Windows signal emulation, perhaps? But
if so, why haven't we found it years ago?
I would be curious to see if there is a race condition in
src/test/isolation/isolationtester.c between the loop starting
on line 820:
while ((res = PQgetResult(conn)))
{
...
}
and the attempt to consume input that might include NOTIFY
messages on line 861:
PQconsumeInput(conn);
In principle, the issue should not be there, because commits
790026972 et al should have ensured that the NOTIFY protocol
message comes out before ReadyForQuery (and thus, libpq will
absorb it before PQgetResult will return NULL). I think the
timing problem --- if that's what it is --- must be on the
backend side; somehow the backend is not processing the
inbound notify queue before it goes idle.
Hmm ... just looking at the code again, could it be that there's
no well-placed CHECK_FOR_INTERRUPTS? Andrew, could you see if
injecting one in what 790026972 added to postgres.c helps?
That is,
/*
* Also process incoming notifies, if any. This is mostly to
* ensure stable behavior in tests: if any notifies were
* received during the just-finished transaction, they'll be
* seen by the client before ReadyForQuery is.
*/
+ CHECK_FOR_INTERRUPTS();
if (notifyInterruptPending)
ProcessNotifyInterrupt();
regards, tom lane
On Tue, Dec 3, 2019 at 10:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hmm ... just looking at the code again, could it be that there's
no well-placed CHECK_FOR_INTERRUPTS? Andrew, could you see if
injecting one in what 790026972 added to postgres.c helps?
I also tried to analyze this failure, and it seems this is a good bet,
but I am also wondering why we have never seen such a timing issue in
other somewhat similar tests, for example the one with the comment (#
Cross-backend notification delivery.). Do those tests have a better way
of ensuring that the notification will be received, or is it purely
coincidental that they haven't shown such a symptom?
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Amit Kapila <amit.kapila16@gmail.com> writes:
On Tue, Dec 3, 2019 at 10:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hmm ... just looking at the code again, could it be that there's
no well-placed CHECK_FOR_INTERRUPTS? Andrew, could you see if
injecting one in what 790026972 added to postgres.c helps?
I also tried to analyze this failure and it seems this is a good bet,
but I am also wondering why we have never seen such a timing issue in
other somewhat similar tests. For ex., one with comment (#
Cross-backend notification delivery.). Do they have a better way of
ensuring that the notification will be received or is it purely
coincidental that they haven't seen such a symptom?
TBH, my bet is that this *won't* fix it, but it seemed like an easy
thing to test. For this to fix it, you'd have to suppose that we
never do a CHECK_FOR_INTERRUPTS during a COMMIT command, which is
improbable at best.
regards, tom lane
On Wed, Dec 4, 2019 at 12:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
TBH, my bet is that this *won't* fix it, but it seemed like an easy
thing to test. For this to fix it, you'd have to suppose that we
never do a CHECK_FOR_INTERRUPTS during a COMMIT command, which is
improbable at best.
You win your bet. Tried this on frogmouth and it still failed.
cheers
andrew
On Wed, Dec 4, 2019 at 9:51 PM Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:
You win your bet. Tried this on frogmouth and it still failed.
IIUC, this means that the commit (step l2commit) is finishing before
the notify signal reaches that session. If so, can we at least confirm
that by adding something like select pg_sleep(1) in that step? So,
l2commit would become: step "l2commit" { SELECT pg_sleep(1); COMMIT; }.
I think we can also try increasing the sleep time to confirm the
behavior if required.
On 12/5/19 4:37 AM, Amit Kapila wrote:
IIUC, this means that the commit (step l2commit) is finishing before
the notify signal reaches that session. If so, can we at least confirm
that by adding something like select pg_sleep(1) in that step?
Yeah, with the sleep in there the NOTIFY is seen.
cheers
andrew
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
On 12/5/19 4:37 AM, Amit Kapila wrote:
IIUC, this means that commit (step l2commit) is finishing before the
notify signal is reached that session. If so, can we at least confirm
that by adding something like select pg_sleep(1) in that step? So,
l2commit will be: step "l2commit" { SELECT pg_sleep(1); COMMIT; }. I
think we can try by increasing sleep time as well to confirm the
behavior if required.
Yeah, with the sleep in there the NOTIFY is seen.
Well, that is *really* interesting, because I was fairly sure that
everything was adequately interlocked. The signal must have been
sent before step notify1 finishes, and then we do several other
things, so how could the listener2 process not have gotten it by
the time we run the l2commit step? I still think this is showing
us some sort of deficiency in our Windows signal mechanism.
A possible theory as to what's happening is that the kernel scheduler
is discriminating against listener2's signal management thread(s)
and not running them until everything else goes idle for a moment.
(If true, even a very short sleep ought to be enough to fix the test.)
If that's what's happening, though, I think we ought to look into
whether we can raise the priority of the signal threads compared to
the main thread. I don't think we want this much variation between
the way signals work on Windows and the way they work elsewhere.
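The experiment itself would be cheap; something like this where the
signal thread is spawned in src/backend/port/win32/signal.c (a sketch
only; the handle variable name and the particular priority level are
illustrative, not a tested proposal):

```diff
 	hThread = CreateThread(NULL, 0,
 						   (LPTHREAD_START_ROUTINE) pg_signal_thread,
 						   NULL, 0, NULL);
+	/* Experimental: try to keep the signal thread from being starved
+	 * by busier threads (priority level chosen for illustration). */
+	SetThreadPriority(hThread, THREAD_PRIORITY_ABOVE_NORMAL);
```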
regards, tom lane
On Sat, Dec 7, 2019 at 5:01 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
A possible theory as to what's happening is that the kernel scheduler
is discriminating against listener2's signal management thread(s)
and not running them until everything else goes idle for a moment.
If we have to believe that theory, then why is the other similar test
not showing the problem? The other test I am talking about is below:
# Cross-backend notification delivery.
permutation "llisten" "notify1" "notify2" "notify3" "notifyf" "lcheck"
In this test as well, one session is listening while the other session
sends a few notifications; when the first session then executes another
command (lcheck), it receives the notifications. I have also verified
in a debugger on a Windows box that as soon as notify sends the signal,
the signal thread receives it, comes out of ConnectNamedPipe, and does
the processing to dispatch the signal.
Amit Kapila <amit.kapila16@gmail.com> writes:
On Sat, Dec 7, 2019 at 5:01 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
A possible theory as to what's happening is that the kernel scheduler
is discriminating against listener2's signal management thread(s)
and not running them until everything else goes idle for a moment.
If we have to believe that theory, then why is the other similar test
not showing the problem?
There are fewer processes involved in that case, so I don't think
it disproves the theory that this is a scheduler glitch.
I have also verified in a debugger on a Windows box that as soon as
notify sends the signal, the signal thread receives it, comes out of
ConnectNamedPipe, and does the processing to dispatch the signal.
Have you done that debugging on a machine that's showing the failure?
Since only some of our Windows critters are showing it, it's evidently
dependent on environment or Windows version somehow.
regards, tom lane
I wrote:
Amit Kapila <amit.kapila16@gmail.com> writes:
On Sat, Dec 7, 2019 at 5:01 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
A possible theory as to what's happening is that the kernel scheduler
is discriminating against listener2's signal management thread(s)
and not running them until everything else goes idle for a moment.
If we have to believe that theory then why the other similar test is
not showing the problem.
There are fewer processes involved in that case, so I don't think
it disproves the theory that this is a scheduler glitch.
So, just idly looking at the code in src/backend/port/win32/signal.c
and src/port/kill.c, I have to wonder why we have this baroque-looking
design of using *two* signal management threads. And, if I'm
reading it right, we create an entire new pipe object and an entire
new instance of the second thread for each incoming signal. Plus, the
signal senders use CallNamedPipe (hence, underneath, TransactNamedPipe)
which means they in effect wait for the recipient's signal-handling
thread to ack receipt of the signal. Maybe there's a good reason for
all this but it sure seems like a lot of wasted cycles from here.
I have to wonder why we don't have a single named pipe that lasts as
long as the recipient process does, and a signal sender just writes
one byte to it, and considers the signal delivered if it is able to
do that. The "message" semantics seem like overkill for that.
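Under that scheme, the sending side could be about as simple as this
sketch (function name, pipe-name format, and error handling here are
illustrative only, not a worked-out patch):

```c
/* Sketch: deliver "signal" signum to process pid by writing one byte
 * to a long-lived, per-process signal pipe.  If the write succeeds,
 * the signal is considered delivered. */
static int
pgkill_sketch(int pid, int signum)
{
	char		pipename[128];
	BYTE		b = (BYTE) signum;
	DWORD		written;
	HANDLE		h;

	snprintf(pipename, sizeof(pipename), "\\\\.\\pipe\\pgsignal_%u",
			 (unsigned) pid);
	h = CreateFile(pipename, GENERIC_WRITE, 0, NULL, OPEN_EXISTING, 0, NULL);
	if (h == INVALID_HANDLE_VALUE)
		return -1;				/* no such process, presumably */
	if (!WriteFile(h, &b, 1, &written, NULL) || written != 1)
	{
		CloseHandle(h);
		return -1;
	}
	CloseHandle(h);
	return 0;
}
```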
I dug around in the contemporaneous archives and could only find
/messages/by-id/303E00EBDD07B943924382E153890E5434AA47@cuthbert.rcsinc.local
which describes the existing approach but fails to explain why we
should do it like that.
This might or might not have much to do with the immediate problem,
but I can't help wondering if there's some race-condition-ish behavior
in there that's contributing to what we're seeing. We already had to
fix a couple of race conditions from doing it like this, cf commits
2e371183e, 04a4413c2, f27a4696f. Perhaps 0ea1f2a3a is relevant
as well.
regards, tom lane
On Sun, Dec 8, 2019 at 1:26 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
This might or might not have much to do with the immediate problem,
but I can't help wondering if there's some race-condition-ish behavior
in there that's contributing to what we're seeing.
On the receiving side, the work we do after the 'notify' is finished
(or before CallNamedPipe gets control back) is as follows:
pg_signal_dispatch_thread()
{
..
FlushFileBuffers(pipe);
DisconnectNamedPipe(pipe);
CloseHandle(pipe);
pg_queue_signal(sigNum);
}
Most of these are system calls, which makes me think that they might
be slow enough on some Windows versions to lead to such a race
condition.
Now, coming back to the other theory, that the scheduler is not able
to schedule these signal management threads: I think if that were the
case, then notify could not have finished, because CallNamedPipe
returns only when the dispatch thread writes back to the pipe. If the
scheduler kicks this thread out just after it has written back to the
pipe, it is possible that we would see such behavior; however, I am
not sure we can do anything about that.
On Sat, Dec 7, 2019 at 10:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Have you done that debugging on a machine that's showing the failure?
No, it is on my local Win-7 setup.
On Sun, Dec 8, 2019 at 10:44 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
Most of these are system calls, which makes me think that they might
be slow enough on some Windows versions to lead to such a race
condition.
IIUC, once the dispatch thread has queued the signal
(pg_queue_signal), the next CHECK_FOR_INTERRUPTS by the main thread
will service it. So, if we move pg_queue_signal() before the
WriteFile in pg_signal_dispatch_thread(), this race condition will
be closed. Now, we might not want to do this, as it adds some time
(even though very little) before the notify on the other side can
finish, or maybe there is some technical problem with this idea that
I am not able to see immediately.
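In patch form, the reordering would amount to something like this
(a sketch against pg_signal_dispatch_thread(), with the surrounding
lines abbreviated and the WriteFile call shown only schematically):

```diff
 pg_signal_dispatch_thread(LPVOID param)
 {
 	...
+	/* Queue the signal first, so that by the time the sender's
+	 * CallNamedPipe returns, the signal is already visible to the
+	 * recipient's main thread. */
+	pg_queue_signal(sigNum);
 	WriteFile(pipe, &sigNum, 1, &bytes, NULL);
 	FlushFileBuffers(pipe);
 	DisconnectNamedPipe(pipe);
 	CloseHandle(pipe);
-	pg_queue_signal(sigNum);
 }
```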
Amit Kapila <amit.kapila16@gmail.com> writes:
IIUC, once the dispatch thread has queued the signal
(pg_queue_signal), the next CHECK_FOR_INTERRUPTS by the main thread
will execute the signal. So, if we move pg_queue_signal() before we
do WriteFile in pg_signal_dispatch_thread(), this race condition will
be closed. Now, we might not want to do this as that will add some
more time (even though very less) before notify on the other side can
finish or maybe there is some technical problem with this idea which I
am not able to see immediately.
Hmm. Certainly worth trying to see if it resolves the failure on
Andrew's machines.
It's not real hard to believe that TransactNamedPipe could be
"optimized" so that it preferentially schedules the client thread
once the handshake is done, not the server thread (based on some
heuristic claim that the former is probably an interactive process
and the latter less so). In that situation, we'd proceed on with
the signal not really delivered, and there is nothing guaranteeing
that it will be delivered anytime soon --- the rest of the test
can make progress regardless of whether that thread ever gets
scheduled again. So, as long as we've got this handshake mechanism,
it seems like it'd be a good thing for the ack to indicate that
the signal was *actually* delivered (by setting the recipient's
flag bit) and not just that it'll probably get delivered eventually.
I remain a bit unsure that we actually need the handshaking business
at all --- I doubt that Unix signals provide any guarantee of synchronous
delivery on most platforms. (If I'm reading the POSIX spec correctly,
it only requires synchronous delivery when a thread signals itself.)
But the existence of this unsynchronized thread in the Windows
implementation sure seems like a dubious thing, now that you
point it out.
regards, tom lane
I wrote:
Amit Kapila <amit.kapila16@gmail.com> writes:
IIUC, once the dispatch thread has queued the signal
(pg_queue_signal), the next CHECK_FOR_INTERRUPTS by the main thread
will execute the signal. So, if we move pg_queue_signal() before we
do WriteFile in pg_signal_dispatch_thread(), this race condition will
be closed. Now, we might not want to do this as that will add some
more time (even though very less) before notify on the other side can
finish or maybe there is some technical problem with this idea which I
am not able to see immediately.
Hmm. Certainly worth trying to see if it resolves the failure on
Andrew's machines.
For Andrew's convenience, here's a draft patch for that. I took the
liberty of improving the rather thin comments in this area, too.
regards, tom lane
Attachments:
this-might-fix-windows-race-condition.patch (text/x-diff, +40 -3)
I wrote:
So, just idly looking at the code in src/backend/port/win32/signal.c
and src/port/kill.c, I have to wonder why we have this baroque-looking
design of using *two* signal management threads. And, if I'm
reading it right, we create an entire new pipe object and an entire
new instance of the second thread for each incoming signal. Plus, the
signal senders use CallNamedPipe (hence, underneath, TransactNamedPipe)
which means they in effect wait for the recipient's signal-handling
thread to ack receipt of the signal. Maybe there's a good reason for
all this but it sure seems like a lot of wasted cycles from here.
Here's a possible patch (untested by me) to get rid of the second thread
and the new-pipe-for-every-request behavior. I believe that the existing
logic may be based on Microsoft's "Multithreaded Pipe Server" example [1]
or something similar, but that's based on an assumption that servicing
a client request may take a substantial amount of time and it's worth
handling requests concurrently. Neither point applies in this context.
Doing it like this seems attractive to me because it gets rid of two
different failure modes: inability to create a new thread and inability
to create a new pipe handle. Now on the other hand, it means that
inability to complete the read/write transaction with a client right
away will delay processing of other signals. But we know that the
client is engaged in a CallNamedPipe operation, so how realistic is
that concern?
This is to be applied on top of the other patch I just sent.
regards, tom lane
[1]: https://docs.microsoft.com/en-us/windows/win32/ipc/multithreaded-pipe-server
Attachments:
use-only-one-signal-thread.patch (text/x-diff, +37 -94)
On 12/8/19 11:57 AM, Tom Lane wrote:
Here's a possible patch (untested by me) to get rid of the second thread
and the new-pipe-for-every-request behavior. ...
This is to be applied on top of the other patch I just sent.
Patch 1 fixed the problems on frogmouth.
Patch 2 also ran without incident.
Tests were run on REL_10_STABLE.
cheers
andrew