The real reason why TAP testing isn't ready for prime time

Started by Tom Lanealmost 11 years ago22 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

Buildfarm member hamster has failed a pretty significant fraction of
its recent runs in the BinInstallCheck step:
http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=hamster&br=HEAD

Since other critters aren't equally distressed, it seems likely that
this is just an out-of-disk-space type of problem. But maybe it's
trying to tell us that there's a genuine platform-specific bug there.
In any case, I challenge anyone to figure out what's happening from
the information available from the buildfarm logs.

I don't know whether this is just that the buildfarm script isn't
collecting data that it should be. But my experiences with the
TAP test scripts haven't been very positive. When they fail, it
takes a lot of digging to find out why. Basically, that entire
mechanism sucks as far as debuggability is concerned.

I think there is a good argument for turning this off in the buildfarm
until there is a better way of identifying and solving problems. It is
not helping us that hamster is red half the time for undiscoverable
reasons. That just conditions people to ignore it, and it may well be
masking real problems that the machine could be finding if it weren't
failing at this step.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#1)
Re: The real reason why TAP testing isn't ready for prime time

On Mon, Jun 15, 2015 at 3:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Buildfarm member hamster has failed a pretty significant fraction of
its recent runs in the BinInstallCheck step:
http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=hamster&amp;br=HEAD

Since other critters aren't equally distressed, it seems likely that
this is just an out-of-disk-space type of problem. But maybe it's
trying to tell us that there's a genuine platform-specific bug there.
In any case, I challenge anyone to figure out what's happening from
the information available from the buildfarm logs.

I don't know whether this is just that the buildfarm script isn't
collecting data that it should be. But my experiences with the
TAP test scripts haven't been very positive. When they fail, it
takes a lot of digging to find out why. Basically, that entire
mechanism sucks as far as debuggability is concerned.

Indeed. I think that one step in the good direction would be to
replace all the calls to system and system_or_bail with a wrapper
routine that calls IPC::Run able to catch the logs and store those
logs in each test's base path. The same applies to pg_rewind tests.

I think there is a good argument for turning this off in the buildfarm
until there is a better way of identifying and solving problems. It is
not helping us that hamster is red half the time for undiscoverable
reasons. That just conditions people to ignore it, and it may well be
masking real problems that the machine could be finding if it weren't
failing at this step.

hamster is legendary slow and has a slow disc, hence it improves
chances of catching race conditions, and it is the only slow buildfarm
machine enabling the TAP tests (by comparison dangomushi has never
failed with the TAP tests) hence I would prefer thinking that the
problem is not specific to ArchLinux ARM. In this case the failure
seems to be related to the timing test servers stop and start even if
-w switch is used with pg_ctl, particularly that PGPORT is set to the
same value for all servers... Still, for the time being I don't mind
disabling them and just did so now. I will try to investigate further
on the machine itself.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#2)
Re: The real reason why TAP testing isn't ready for prime time

On Mon, Jun 15, 2015 at 8:26 AM, Michael Paquier wrote:

hamster is legendary slow and has a slow disc, hence it improves
chances of catching race conditions, and it is the only slow buildfarm
machine enabling the TAP tests (by comparison dangomushi has never
failed with the TAP tests) hence I would prefer thinking that the
problem is not specific to ArchLinux ARM. In this case the failure
seems to be related to the timing test servers stop and start even if
-w switch is used with pg_ctl, particularly that PGPORT is set to the
same value for all servers... Still, for the time being I don't mind
disabling them and just did so now. I will try to investigate further
on the machine itself.

So, I have been doing some progress on this thing.. And wrote the
patch attached that improves the way TAP tests log their output using
IPC::Run::run. This simply creates a log file as regress_log/test_name
for each test run, capturing all the command outputs. For example on
current HEAD, some commands of pg_ctl are simply silenced or have
their output redirected to /dev/null. With this patch, all their
output is redirected to the log file. This is as well compatible with
Windows, and logging of pg_rewind tests is unified with the new
structure of TestLib.pm. I am going to apply that to hamster and see
what is causing the error reported.

I think that it would be useful as well to improve the buildfarm
output. Thoughts?
Regards
--
Michael

Attachments:

20160618_tap_improve_logs.patchapplication/x-patch; name=20160618_tap_improve_logs.patchDownload+85-61
#4Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#3)
Re: The real reason why TAP testing isn't ready for prime time

On Thu, Jun 18, 2015 at 3:52 PM, Michael Paquier wrote:

I think that it would be useful as well to improve the buildfarm
output. Thoughts?

And after running the tests more or less 6~7 times in a row on a PI, I
have been able to trigger the problem and I think that I have found
its origin. First, the error has been triggered by the tests of
pg_rewind:
t/002_databases.pl ...
1..4
Bailout called. Further testing stopped: run pg_ctl failed: 256
Bail out! run pg_ctl failed: 256
FAILED--Further testing stopped: run pg_ctl failed: 256
Makefile:51: recipe for target 'check' failed
make[1]: *** [check] Error 255

And by looking at the logs obtained thanks to the previous patch I
could see the following (log attached for tests 1 and 2):
$ tail -n5 regress_log/regress_log_002_databases
waiting for server to start........ stopped waiting
pg_ctl: could not start server
Examine the log output.
LOG: received immediate shutdown request
LOG: received immediate shutdown request

pg_ctl should be able to start the server and should not fail here.
This is confirmed by the fact that first test has not stopped the
servers. On a clean run, the immediate shutdown request is received
and done:
waiting for server to shut down....LOG: received immediate shutdown request
LOG: unexpected EOF on standby connection
done

But in the case of the failure this does not happen:
LOG: received immediate shutdown request
LOG: unexpected EOF on standby connection
LOG: received immediate shutdown request
See the "done" is not here.

Now if we look at RewindTest.pm, there is the following code:
if ($test_master_datadir)
{
system
"pg_ctl -D $test_master_datadir -s -m immediate stop
2> /dev/null";
}
if ($test_standby_datadir)
{
system
"pg_ctl -D $test_standby_datadir -s -m immediate
stop 2> /dev/null";
}
And I think that the problem is triggered because we are missing a -w
switch here, meaning that we do not wait until the confirmation that
the server has stopped, and visibly if stop is slow enough the next
server to use cannot start because the port is already taken by the
server currently stopping.

Note as well that the last command of pg_ctl stop in
pg_ctl/t/002_status.pl does not use -w, so we have the same problem
there.

Attached is a patch fixing those problems and improving the log
facility as it really helped me out with those issues. The simplest
fix would be to include the -w switch missing in the tests of
pg_rewind and pg_ctl though.

It would be good to get that fixed, then I would be able to re-enable
the TAP tests of hamster. I have run the tests a dozen of times again
with this patch, and I could not trigger the failure anymore.
Regards,
--
Michael

Attachments:

regress_log_001_basicapplication/octet-stream; name=regress_log_001_basicDownload
regress_log_002_databasesapplication/octet-stream; name=regress_log_002_databasesDownload
20150619_tap_improve_logs_v2.patchapplication/x-patch; name=20150619_tap_improve_logs_v2.patchDownload+90-64
#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#4)
Re: The real reason why TAP testing isn't ready for prime time

Michael Paquier <michael.paquier@gmail.com> writes:

Attached is a patch fixing those problems and improving the log
facility as it really helped me out with those issues. The simplest
fix would be to include the -w switch missing in the tests of
pg_rewind and pg_ctl though.

I agree with adding the -w switch and will go do that. As far as the
rest of this patch goes, it seems like it could be made less invasive
if the logs got dumped into a subdirectory of tmp_check rather than
adding another top-level directory that has to be cleaned?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#4)
Re: The real reason why TAP testing isn't ready for prime time

Michael Paquier <michael.paquier@gmail.com> writes:

Now if we look at RewindTest.pm, there is the following code:
if ($test_master_datadir)
{
system
"pg_ctl -D $test_master_datadir -s -m immediate stop
2> /dev/null";
}
if ($test_standby_datadir)
{
system
"pg_ctl -D $test_standby_datadir -s -m immediate
stop 2> /dev/null";
}
And I think that the problem is triggered because we are missing a -w
switch here, meaning that we do not wait until the confirmation that
the server has stopped, and visibly if stop is slow enough the next
server to use cannot start because the port is already taken by the
server currently stopping.

After I woke up a bit more, I remembered that -w is already the default
for "pg_ctl stop", so your diagnosis here is incorrect.

I suspect that the real problem is the arbitrary decision to use -m
immediate. The postmaster would ordinarily wait for its children to
die, but on a slow machine we could perhaps reach the end of that
5-second timeout, whereupon the postmaster would SIGKILL its children
*and exit immediately*. I'm not sure how instantaneous SIGKILL is,
but it seems possible that we could end up trying to start the new
postmaster before all the children of the old one are dead. If the
shmem interlock is working properly that ought to fail.

I wonder whether it's such a good idea for the postmaster to give
up waiting before all children are gone (postmaster.c:1722 in HEAD).

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#6)
Re: The real reason why TAP testing isn't ready for prime time

On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder whether it's such a good idea for the postmaster to give
up waiting before all children are gone (postmaster.c:1722 in HEAD).

I doubt it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#7)
Re: The real reason why TAP testing isn't ready for prime time

On 2015-06-19 11:16:18 -0400, Robert Haas wrote:

On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder whether it's such a good idea for the postmaster to give
up waiting before all children are gone (postmaster.c:1722 in HEAD).

I doubt it.

Seconded. It's pretty bad to possibly not be able to start again after
stopping it. I don't see the benefit in not waiting - sure, the
poastmaster exits faster, but postgres hasn't shut down at that point...

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#8)
Re: The real reason why TAP testing isn't ready for prime time

Andres Freund <andres@anarazel.de> writes:

On 2015-06-19 11:16:18 -0400, Robert Haas wrote:

On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder whether it's such a good idea for the postmaster to give
up waiting before all children are gone (postmaster.c:1722 in HEAD).

I doubt it.

Seconded. It's pretty bad to possibly not be able to start again after
stopping it. I don't see the benefit in not waiting - sure, the
poastmaster exits faster, but postgres hasn't shut down at that point...

Yeah. I'll see about fixing this. Hard to be sure if it will fix
hamster's issue though.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#9)
Re: The real reason why TAP testing isn't ready for prime time

Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2015-06-19 11:16:18 -0400, Robert Haas wrote:

On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder whether it's such a good idea for the postmaster to give
up waiting before all children are gone (postmaster.c:1722 in HEAD).

I doubt it.

Seconded. It's pretty bad to possibly not be able to start again after
stopping it. I don't see the benefit in not waiting - sure, the
poastmaster exits faster, but postgres hasn't shut down at that point...

Yeah. I'll see about fixing this. Hard to be sure if it will fix
hamster's issue though.

We discussed this when that patch got in (82233ce7ea42d6b). The reason
for not waiting, it was argued, is that the most likely reason for those
processes not to have already gone away by the time we send SIGKILL was
that they are stuck somewhere in the kernel, and so we might not be able
to actually get them to go away with the SIGKILL. As I recall, that was
the actual problem that MauMau was trying to get fixed.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#10)
Re: The real reason why TAP testing isn't ready for prime time

On 2015-06-19 13:56:21 -0300, Alvaro Herrera wrote:

We discussed this when that patch got in (82233ce7ea42d6b). The reason
for not waiting, it was argued, is that the most likely reason for those
processes not to have already gone away by the time we send SIGKILL was
that they are stuck somewhere in the kernel, and so we might not be able
to actually get them to go away with the SIGKILL. As I recall, that was
the actual problem that MauMau was trying to get fixed.

How does exiting before they're dead help? They're still going to be
attached to shared memeory and thus prevent a restart. I don't think
hiding the problem by exiting the postmaster helps at all.

I don't think it's our job to protect against processes stuck in the
kernel. If that happens something seriously has gone wrong, and we
shouldn't just continue without making that visible.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#10)
Re: The real reason why TAP testing isn't ready for prime time

Alvaro Herrera wrote:

We discussed this when that patch got in (82233ce7ea42d6b). The reason
for not waiting, it was argued, is that the most likely reason for those
processes not to have already gone away by the time we send SIGKILL was
that they are stuck somewhere in the kernel, and so we might not be able
to actually get them to go away with the SIGKILL. As I recall, that was
the actual problem that MauMau was trying to get fixed.

Hm, I reviewed the discussion and actually I'm wrong: I came up with
the unkillable process argument on my own. MauMau's problem was a
deadlocked process because of trying to malloc() on its way out.
I still think that unkillable processes are an issue, but perhaps it's
sane to have postmaster wait for some time instead of just going away
immediately.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#11)
Re: The real reason why TAP testing isn't ready for prime time

Andres Freund <andres@anarazel.de> writes:

On 2015-06-19 13:56:21 -0300, Alvaro Herrera wrote:

We discussed this when that patch got in (82233ce7ea42d6b). The reason
for not waiting, it was argued, is that the most likely reason for those
processes not to have already gone away by the time we send SIGKILL was
that they are stuck somewhere in the kernel, and so we might not be able
to actually get them to go away with the SIGKILL. As I recall, that was
the actual problem that MauMau was trying to get fixed.

How does exiting before they're dead help? They're still going to be
attached to shared memeory and thus prevent a restart. I don't think
hiding the problem by exiting the postmaster helps at all.

My thought exactly.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#13)
Re: The real reason why TAP testing isn't ready for prime time

Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2015-06-19 13:56:21 -0300, Alvaro Herrera wrote:

We discussed this when that patch got in (82233ce7ea42d6b). The reason
for not waiting, it was argued, is that the most likely reason for those
processes not to have already gone away by the time we send SIGKILL was
that they are stuck somewhere in the kernel, and so we might not be able
to actually get them to go away with the SIGKILL. As I recall, that was
the actual problem that MauMau was trying to get fixed.

How does exiting before they're dead help? They're still going to be
attached to shared memeory and thus prevent a restart. I don't think
hiding the problem by exiting the postmaster helps at all.

My thought exactly.

I guess you have a point there.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#6)
Re: The real reason why TAP testing isn't ready for prime time

On Sat, Jun 20, 2015 at 12:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

Now if we look at RewindTest.pm, there is the following code:
if ($test_master_datadir)
{
system
"pg_ctl -D $test_master_datadir -s -m immediate stop
2> /dev/null";
}
if ($test_standby_datadir)
{
system
"pg_ctl -D $test_standby_datadir -s -m immediate
stop 2> /dev/null";
}
And I think that the problem is triggered because we are missing a -w
switch here, meaning that we do not wait until the confirmation that
the server has stopped, and visibly if stop is slow enough the next
server to use cannot start because the port is already taken by the
server currently stopping.

After I woke up a bit more, I remembered that -w is already the default
for "pg_ctl stop", so your diagnosis here is incorrect.

Ah right. I forgot that. Perhaps I got just lucky in my runs.

I suspect that the real problem is the arbitrary decision to use -m
immediate. The postmaster would ordinarily wait for its children to
die, but on a slow machine we could perhaps reach the end of that
5-second timeout, whereupon the postmaster would SIGKILL its children
*and exit immediately*. I'm not sure how instantaneous SIGKILL is,
but it seems possible that we could end up trying to start the new
postmaster before all the children of the old one are dead. If the
shmem interlock is working properly that ought to fail.

I wonder whether it's such a good idea for the postmaster to give
up waiting before all children are gone (postmaster.c:1722 in HEAD).

I don't think so as well.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#9)
Re: The real reason why TAP testing isn't ready for prime time

On Sat, Jun 20, 2015 at 12:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

On 2015-06-19 11:16:18 -0400, Robert Haas wrote:

On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder whether it's such a good idea for the postmaster to give
up waiting before all children are gone (postmaster.c:1722 in HEAD).

I doubt it.

Seconded. It's pretty bad to possibly not be able to start again after
stopping it. I don't see the benefit in not waiting - sure, the
poastmaster exits faster, but postgres hasn't shut down at that point...

Yeah. I'll see about fixing this. Hard to be sure if it will fix
hamster's issue though.

I have re-enabled the TAP tests in hamster. What happens in the next
couple of days will be interesting to see.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#5)
Re: The real reason why TAP testing isn't ready for prime time

On Fri, Jun 19, 2015 at 11:45 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

Attached is a patch fixing those problems and improving the log
facility as it really helped me out with those issues. The simplest
fix would be to include the -w switch missing in the tests of
pg_rewind and pg_ctl though.

I agree with adding the -w switch and will go do that. As far as the
rest of this patch goes, it seems like it could be made less invasive
if the logs got dumped into a subdirectory of tmp_check rather than
adding another top-level directory that has to be cleaned?

tmp_check remains unused since dcae5fac (missing something perhaps?),
but I guess we could use it again to save the logs in it, and have
pg_rewind do the same.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#17)
Re: The real reason why TAP testing isn't ready for prime time

On Sat, Jun 20, 2015 at 7:06 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

As far as the
rest of this patch goes, it seems like it could be made less invasive
if the logs got dumped into a subdirectory of tmp_check rather than
adding another top-level directory that has to be cleaned?

tmp_check remains unused since dcae5fac (missing something perhaps?),
but I guess we could use it again to save the logs in it, and have
pg_rewind do the same.

OK, so here is an updated patch. All the logs are saved in
tmp_check/log, and each log file is named with the name of the file
that launched the test to facilitate its identification. I noticed at
the same time that some commands were called in silent mode. I removed
that and now what they output is written as well in the log files.
Regards,
--
Michael

Attachments:

20150620_tap_improve_logs_v3.patchapplication/x-patch; name=20150620_tap_improve_logs_v3.patchDownload+83-71
#19Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#16)
Re: The real reason why TAP testing isn't ready for prime time

On Sat, Jun 20, 2015 at 6:53 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

On Sat, Jun 20, 2015 at 12:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

On 2015-06-19 11:16:18 -0400, Robert Haas wrote:

On Fri, Jun 19, 2015 at 11:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wonder whether it's such a good idea for the postmaster to give
up waiting before all children are gone (postmaster.c:1722 in HEAD).

I doubt it.

Seconded. It's pretty bad to possibly not be able to start again after
stopping it. I don't see the benefit in not waiting - sure, the
poastmaster exits faster, but postgres hasn't shut down at that point...

Yeah. I'll see about fixing this. Hard to be sure if it will fix
hamster's issue though.

I have re-enabled the TAP tests in hamster. What happens in the next
couple of days will be interesting to see.he perl

And, we get a failure:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&amp;dt=2015-06-20%2017%3A59%3A01
I am not sure why buildfarm runs makes it more easily reproducible,
one of the reasons may be the perl scripts run underground.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#19)
Re: The real reason why TAP testing isn't ready for prime time

On Sun, Jun 21, 2015 at 7:06 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

And, we get a failure:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&amp;dt=2015-06-20%2017%3A59%3A01
I am not sure why buildfarm runs makes it more easily reproducible,
one of the reasons may be the perl scripts run underground.

hamster has not complained for a couple of weeks now, and the issue
was reproducible every 4~6 days:
http://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=hamster&amp;br=HEAD
Hence let's consider the issue as resolved.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#20)
#22Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#21)