MSVC odd TAP test problem
I have been working on enabling the remaining TAP tests on MSVC build in
the buildfarm client, but I have come across an odd problem. The bin
tests all run fine, but the recover tests crash and in such a way as to
crash the buildfarm client itself and require some manual cleanup. This
happens at some stage after the tests have run (the final "ok" is
output) but before the END handler in PostgresNode.pm (I put some traces
in there to see if I could narrow down where there were problems).
The symptom is that this appears at the end of the output when the
client calls "vcregress.pl taptest src/test/recover":
Terminating on signal SIGBREAK(21)
Terminating on signal SIGBREAK(21)
Terminate batch job (Y/N)?
And at that point there is nothing at all apparently running, according
to Sysinternals Process Explorer, including the buildfarm client.
It's 100% repeatable on bowerbird, and I'm a bit puzzled about how to
fix it.
Anyone have any clues?
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 7 May 2017 4:24 am, "Andrew Dunstan" <andrew.dunstan@2ndquadrant.com>
wrote:
> I have been working on enabling the remaining TAP tests on MSVC build in
> the buildfarm client, but I have come across an odd problem. The bin
> tests all run fine, but the recover tests crash and in such a way as to
> crash the buildfarm client itself and require some manual cleanup. This
> happens at some stage after the tests have run (the final "ok" is
> output) but before the END handler in PostgresNode.pm (I put some traces
> in there to see if I could narrow down where there were problems).
> The symptom is that this appears at the end of the output when the
> client calls "vcregress.pl taptest src/test/recover":
> Terminating on signal SIGBREAK(21)
> Terminating on signal SIGBREAK(21)
> Terminate batch job (Y/N)?
> And at that point there is nothing at all apparently running, according
> to Sysinternals Process Explorer, including the buildfarm client.
> It's 100% repeatable on bowerbird, and I'm a bit puzzled about how to
> fix it.
> Anyone have any clues?
That looks like we've upset CMD.exe itself. I'm not sure how ... leaking
a signal to the parent proc?
I suspect this could be something to do with console process groups.
Bowerbird is win8, so this isn't going to be related to the support for
ANSI escapes added in win10.
A search for the error turns up a complaint about IPC::Run as the first
hit. Probably not a coincidence.
http://stackoverflow.com/q/40924750
See this bug
On 05/06/2017 07:41 PM, Craig Ringer wrote:
> That looks like we've upset CMD.exe itself. I'm not sure how ...
> leaking a signal to the parent proc?
> I suspect this could be something to do with console process groups.
> Bowerbird is win8, so this isn't going to be related to the support
> for ANSI escapes added in win10.
> A search for the error turns up a complaint about IPC::Run as the
> first hit. Probably not a coincidence.
> http://stackoverflow.com/q/40924750
> See this bug
Actually, it's Win10, looks like I forgot to update the personality, my bad.
I had a feeling it was probably something to do with timeout. That RT
ticket looks like it's on the money.
cheers
andrew
On 05/06/2017 08:54 PM, Andrew Dunstan wrote:
> Actually, it's Win10, looks like I forgot to update the personality, my bad.
> I had a feeling it was probably something to do with timeout. That RT
> ticket looks like it's on the money.
(After extensive trial and error) Turns out it's not quite that; it's
the kill_kill stuff. I think for now we should just disable it on the
platform. That means not running tests 7 and 8 of the logical_decoding
tests and all of the crash_recovery test. Test::More has nice
facilities for skipping tests cleanly. See attached patch.
cheers
andrew
Attachments:
0001-Avoid-tests-which-crash-the-calling-process-on-Windo.patch (text/x-patch, +26 -9)
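For readers unfamiliar with the Test::More skipping facility mentioned above, here is a minimal sketch of the SKIP-block pattern. It is not the attached patch: the helper skip_kill_kill_tests and the test names are hypothetical, and only the skip condition and message follow the patch lines quoted elsewhere in this thread.

```perl
use strict;
use warnings;
use Config;
use Test::More tests => 3;

# Hypothetical helper (not part of the patch): true for native Windows
# Perl builds, which report their osname as 'MSWin32'.
sub skip_kill_kill_tests
{
    my ($osname) = @_;
    return $osname eq 'MSWin32';
}

ok(1, 'test that runs on every platform');

SKIP:
{
    # skip() records the next 2 tests as skipped and leaves the SKIP block,
    # so the planned test count still adds up to 3.
    skip "Test fails on Windows perl", 2
      if skip_kill_kill_tests($Config{osname});

    ok(1, 'first test that would use start/kill_kill');
    ok(1, 'second test that would use start/kill_kill');
}
```

Skipping a whole test file works differently: plan skip_all => "reason" has to be called before any plan is declared, and it ends the script immediately, which is why it is not combined with the block above.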
On Wed, May 10, 2017 at 2:11 AM, Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:
> (After extensive trial and error) Turns out it's not quite that; it's
> the kill_kill stuff. I think for now we should just disable it on the
> platform. That means not running tests 7 and 8 of the logical_decoding
> tests and all of the crash_recovery test. Test::More has nice
> facilities for skipping tests cleanly. See attached patch.
+SKIP:
+{
+ # some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
+ skip "Test fails on Windows perl", 2 if $Config{osname} eq 'MSWin32';
So this basically works with msys but not with MSWin32? Interesting...
Does it make a difference if you use, for example, coup_d_grace =>
"QUIT"? Per the docs of IPC::Run, SIGTERM is used for kills on Windows.
+if ($Config{osname} eq 'MSWin32')
+{
+ # some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
+ plan skip_all => "Test fails on Windows perl";
+}
Indentation is weird here, with a mix of spaces and tabs.
--
Michael
On 05/09/2017 09:37 PM, Michael Paquier wrote:
> +SKIP:
> +{
> + # some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
> + skip "Test fails on Windows perl", 2 if $Config{osname} eq 'MSWin32';
> So this basically works with msys but not with MSWin32? Interesting...
On Msys we use the Msys DTK perl to run prove, and it executes the Msys
shell to run commands, with Msys signal emulation. The buildfarm client
goes to some trouble to arrange this. So it's very different.
> Does it make a difference if you use, for example, coup_d_grace =>
> "QUIT"? Per the docs of IPC::Run, SIGTERM is used for kills on Windows.
No idea. I'll try.
> +if ($Config{osname} eq 'MSWin32')
> +{
> + # some Windows Perls at least don't like IPC::Run's start/kill_kill regime.
> + plan skip_all => "Test fails on Windows perl";
> +}
> Indentation is weird here, with a mix of spaces and tabs.
I will indent it before I commit anything.
cheers
andrew
On 05/10/2017 01:53 AM, Andrew Dunstan wrote:
>> Does it make a difference if you use, for example, coup_d_grace =>
>> "QUIT"? Per the docs of IPC::Run, SIGTERM is used for kills on Windows.
> No idea. I'll try.
This isn't going to work. If you look at the code in IPC/Run.pm you see
that the coup_d_grace signal is only used after it has first sent the
hardcoded SIGTERM. It might be tempting to play with using Sysinternals'
pskill utility, but we can hardly expect buildfarm owners and others to
hack their copies of IPC/Run.pm, so I'm going to go ahead and commit the
changes I proposed.
cheers
andrew
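The ordering described in the message above can be modelled in a few lines. This is only a sketch of the behaviour as reported in this thread, not code taken from IPC/Run.pm: model_kill_kill and its callback are hypothetical names, no real signals are sent, and the grace-period wait is left out.

```perl
use strict;
use warnings;

# Model of the escalation order described above; NOT IPC::Run's own code.
# $send is a callback that takes a signal name and returns true if the
# (hypothetical) child is still alive after receiving it.
sub model_kill_kill
{
    my ($send, %options) = @_;
    my $coup_d_grace =
      defined $options{coup_d_grace} ? $options{coup_d_grace} : 'KILL';
    my @sent;

    # The first step is fixed: TERM goes out whatever the caller asked for.
    push @sent, 'TERM';
    my $alive = $send->('TERM');

    # Only a child that survives TERM ever sees the coup_d_grace signal.
    if ($alive)
    {
        push @sent, $coup_d_grace;
        $send->($coup_d_grace);
    }
    return @sent;
}

# One child that ignores TERM, and one that exits on TERM.
my @stubborn = model_kill_kill(sub { $_[0] eq 'TERM' }, coup_d_grace => 'QUIT');
my @obedient = model_kill_kill(sub { 0 }, coup_d_grace => 'QUIT');

print "stubborn child: @stubborn\n";    # prints "stubborn child: TERM QUIT"
print "obedient child: @obedient\n";    # prints "obedient child: TERM"
```

In this model, changing coup_d_grace never changes the first signal, which is why that option cannot avoid the problem if the initial TERM is what upsets the calling process.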
On Thu, May 11, 2017 at 7:29 AM, Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:
> This isn't going to work. If you look at the code in IPC/Run.pm you see
> that the coup_d_grace signal is only used after it has first sent the
> hardcoded SIGTERM. It might be tempting to play with using Sysinternals'
> pskill utility, but we can hardly expect buildfarm owners and others to
> hack their copies of IPC/Run.pm, so I'm going to go ahead and commit the
> changes I proposed.
OK, thanks for checking. That was worth a try.
--
Michael