Buildfarm TAP testing is useless as currently implemented

Started by Tom Lane over 10 years ago | 7 messages | hackers
#1 Tom Lane
tgl@sss.pgh.pa.us

I challenge anybody to figure out what happened here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2015-07-27%2010%3A25%3A17
or here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2015-07-04%2016%3A00%3A23
or here:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2015-07-07%2016%3A35%3A06

With no visibility of pg_ctl's output, and no copy of the postmaster log,
there is no chance of debugging intermittent failures like this one.
This isn't entirely the buildfarm's fault --- AFAICS, prove-based testing
has inadequate error reporting by design. If "not ok" isn't enough
information for you, tough beans. (It might help if the farm script
captured the postmaster log after a failure, but that would do nothing
for prove's unwillingness to pass through client-side messages.)

I think we should disable TAP testing in the buildfarm until there is
some credible form of error reporting for it. I've grown tired of
looking into buildfarm failure reports only to meet a dead end.
Aside from the wasted investigation time, which admittedly isn't huge,
there's an opportunity cost in that subsequent test steps didn't get run.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#1)
Re: Buildfarm TAP testing is useless as currently implemented

On 07/27/2015 05:06 PM, Tom Lane wrote:

> I challenge anybody to figure out what happened here:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2015-07-27%2010%3A25%3A17
> or here:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2015-07-04%2016%3A00%3A23
> or here:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2015-07-07%2016%3A35%3A06
>
> With no visibility of pg_ctl's output, and no copy of the postmaster log,
> there is no chance of debugging intermittent failures like this one.
> This isn't entirely the buildfarm's fault --- AFAICS, prove-based testing
> has inadequate error reporting by design. If "not ok" isn't enough
> information for you, tough beans. (It might help if the farm script
> captured the postmaster log after a failure, but that would do nothing
> for prove's unwillingness to pass through client-side messages.)

Yep.

> I think we should disable TAP testing in the buildfarm until there is
> some credible form of error reporting for it. I've grown tired of
> looking into buildfarm failure reports only to meet a dead end.
> Aside from the wasted investigation time, which admittedly isn't huge,
> there's an opportunity cost in that subsequent test steps didn't get run.

Commit 1ea06203b - Improve logging of TAP tests - made it a lot better.
The pg_ctl log should be in the log file now. The buildfarm doesn't seem
to capture those logs at the moment, but that should be easy to fix.

- Heikki

#3 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#1)
Re: Buildfarm TAP testing is useless as currently implemented

On 07/27/2015 10:06 AM, Tom Lane wrote:

> I challenge anybody to figure out what happened here:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2015-07-27%2010%3A25%3A17
> or here:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2015-07-04%2016%3A00%3A23
> or here:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2015-07-07%2016%3A35%3A06
>
> With no visibility of pg_ctl's output, and no copy of the postmaster log,
> there is no chance of debugging intermittent failures like this one.
> This isn't entirely the buildfarm's fault --- AFAICS, prove-based testing
> has inadequate error reporting by design. If "not ok" isn't enough
> information for you, tough beans. (It might help if the farm script
> captured the postmaster log after a failure, but that would do nothing
> for prove's unwillingness to pass through client-side messages.)
>
> I think we should disable TAP testing in the buildfarm until there is
> some credible form of error reporting for it. I've grown tired of
> looking into buildfarm failure reports only to meet a dead end.
> Aside from the wasted investigation time, which admittedly isn't huge,
> there's an opportunity cost in that subsequent test steps didn't get run.

Well, it does create a lot of files that we don't pick up. An example
list is shown below, and I am attaching their contents in a single
gzipped attachment. However, these are in the wrong location. This was a
vpath build and yet these tmp_check directories are all created in the
source tree. Let's fix that and then I'll set about having the buildfarm
collect them. That should get us further down the track.

cheers

andrew

/home/andrew/pgl/pg_head/src/bin/pg_controldata/tmp_check/log/regress_log_001_pg_controldata
/home/andrew/pgl/pg_head/src/bin/pg_basebackup/tmp_check/log/regress_log_020_pg_receivexlog
/home/andrew/pgl/pg_head/src/bin/pg_basebackup/tmp_check/log/regress_log_010_pg_basebackup
/home/andrew/pgl/pg_head/src/bin/pg_rewind/regress_log
/home/andrew/pgl/pg_head/src/bin/pg_rewind/tmp_check/log/regress_log_003_extrafiles
/home/andrew/pgl/pg_head/src/bin/pg_rewind/tmp_check/log/regress_log_001_basic
/home/andrew/pgl/pg_head/src/bin/pg_rewind/tmp_check/log/regress_log_002_databases
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_100_vacuumdb
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_091_reindexdb_all
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_050_dropdb
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_070_dropuser
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_020_createdb
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_102_vacuumdb_stages
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_030_createlang
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_060_droplang
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_040_createuser
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_010_clusterdb
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_011_clusterdb_all
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_101_vacuumdb_all
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_090_reindexdb
/home/andrew/pgl/pg_head/src/bin/scripts/tmp_check/log/regress_log_080_pg_isready
/home/andrew/pgl/pg_head/src/bin/pg_config/tmp_check/log/regress_log_001_pg_config
/home/andrew/pgl/pg_head/src/bin/pg_ctl/tmp_check/log/regress_log_001_start_stop
/home/andrew/pgl/pg_head/src/bin/pg_ctl/tmp_check/log/regress_log_002_status
/home/andrew/pgl/pg_head/src/bin/initdb/tmp_check/log/regress_log_001_initdb
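
For illustration only (the buildfarm client itself is written in Perl, and this is not its code): a minimal Python sketch of how files laid out like the list above could be located and bundled into one report. The function names `find_tap_logs` and `bundle_logs` are hypothetical, and the sketch assumes the logs live in `tmp_check/log` directories and are named `regress_log_*`, as in the listing.

```python
from pathlib import Path


def find_tap_logs(root):
    """Return every regress_log_* file that sits in a tmp_check/log directory below root."""
    root = Path(root)
    return sorted(
        p
        for p in root.rglob("regress_log_*")
        if p.is_file()
        and p.parent.name == "log"
        and p.parent.parent.name == "tmp_check"
    )


def bundle_logs(paths):
    """Concatenate log files into one text report, with a header line per file."""
    parts = []
    for p in paths:
        parts.append(f"=== {p} ===\n")
        # errors="replace" keeps the report readable even if a log has odd bytes
        parts.append(p.read_text(errors="replace"))
        parts.append("\n")
    return "".join(parts)
```

Calling `bundle_logs(find_tap_logs(build_root))` would yield a single text blob suitable for attaching to a failure report; which root to scan (source tree or vpath build tree) is exactly the question raised above.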

Attachments:

binlog.gz (application/x-gzip)

#4 Michael Paquier
michael@paquier.xyz
In reply to: Andrew Dunstan (#3)
Re: Buildfarm TAP testing is useless as currently implemented

On Tue, Jul 28, 2015 at 1:15 AM, Andrew Dunstan <andrew@dunslane.net> wrote:

> Well, it does create a lot of files that we don't pick up. An example list
> is shown below, and I am attaching their contents in a single gzipped
> attachment. However, these are in the wrong location. This was a vpath build
> and yet these tmp_check directories are all created in the source tree.
> Let's fix that and then I'll set about having the buildfarm collect them.
> That should get us further down the track.
>
> [log list]

The attached patch fixes that. I suggest that we use env{TESTDIR}/log
as the location for the logs, so that even a vpath build puts the log
files in the right place.
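
As a rough illustration of the idea (not the attached patch), here is a minimal Python sketch of resolving the log directory from a `TESTDIR` environment variable. The function name `tap_log_dir` is hypothetical, and the fallback to the current directory when `TESTDIR` is unset is an assumption made for the sketch.

```python
import os
from pathlib import Path


def tap_log_dir(environ=None):
    """Return <TESTDIR>/log, falling back to ./log when TESTDIR is unset or empty."""
    environ = os.environ if environ is None else environ
    # Assumption: an unset TESTDIR means "use the current working directory".
    base = environ.get("TESTDIR") or "."
    return Path(base) / "log"
```

Because the directory is derived from the environment rather than from the location of the test script, a vpath build that sets `TESTDIR` to its build directory would keep the logs out of the source tree.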
--
Michael

Attachments:

20150728_tap_logs_vpath.patch (binary/octet-stream, +3 -2)

#5 Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#3)
Re: Buildfarm TAP testing is useless as currently implemented

On 07/27/2015 12:15 PM, Andrew Dunstan wrote:

> On 07/27/2015 10:06 AM, Tom Lane wrote:
>> I challenge anybody to figure out what happened here:
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hornet&dt=2015-07-27%2010%3A25%3A17
>> or here:
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=hamster&dt=2015-07-04%2016%3A00%3A23
>> or here:
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2015-07-07%2016%3A35%3A06
>>
>> With no visibility of pg_ctl's output, and no copy of the postmaster log,
>> there is no chance of debugging intermittent failures like this one.
>> This isn't entirely the buildfarm's fault --- AFAICS, prove-based testing
>> has inadequate error reporting by design. If "not ok" isn't enough
>> information for you, tough beans. (It might help if the farm script
>> captured the postmaster log after a failure, but that would do nothing
>> for prove's unwillingness to pass through client-side messages.)
>>
>> I think we should disable TAP testing in the buildfarm until there is
>> some credible form of error reporting for it. I've grown tired of
>> looking into buildfarm failure reports only to meet a dead end.
>> Aside from the wasted investigation time, which admittedly isn't huge,
>> there's an opportunity cost in that subsequent test steps didn't get run.
>
> Well, it does create a lot of files that we don't pick up. An example
> list is shown below, and I am attaching their contents in a single
> gzipped attachment. However, these are in the wrong location. This was
> a vpath build and yet these tmp_check directories are all created in
> the source tree. Let's fix that and then I'll set about having the
> buildfarm collect them. That should get us further down the track.

The situation should now be substantially improved. This buildfarm change
<https://github.com/PGBuildFarm/client-code/commit/e684baacf9cb9f9d821be5088b15b336dc6aae07>
uses today's core changes to pick up log files. See
<http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=crake&dt=2015-07-28%2023%3A08%3A54&stg=bin-check>
for an example.

cheers

andrew

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#5)
Re: Buildfarm TAP testing is useless as currently implemented

Andrew Dunstan <andrew@dunslane.net> writes:

> On 07/27/2015 10:06 AM, Tom Lane wrote:
>> I think we should disable TAP testing in the buildfarm until there is
>> some credible form of error reporting for it.
>
> The situation should now be substantially improved.

Hm, I was just thinking we weren't there yet, because:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-07-28%2023%3A03%3A39

> This buildfarm change
> <https://github.com/PGBuildFarm/client-code/commit/e684baacf9cb9f9d821be5088b15b336dc6aae07>
> uses today's core changes to pick up log files.

Ah, so we need a new buildfarm script release?

regards, tom lane

#7 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#6)
Re: Buildfarm TAP testing is useless as currently implemented

On 07/28/2015 08:58 PM, Tom Lane wrote:

> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 07/27/2015 10:06 AM, Tom Lane wrote:
>>> I think we should disable TAP testing in the buildfarm until there is
>>> some credible form of error reporting for it.
>>
>> The situation should now be substantially improved.
>
> Hm, I was just thinking we weren't there yet, because:
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=axolotl&dt=2015-07-28%2023%3A03%3A39
>
>> This buildfarm change
>> <https://github.com/PGBuildFarm/client-code/commit/e684baacf9cb9f9d821be5088b15b336dc6aae07>
>> uses today's core changes to pick up log files.
>
> Ah, so we need a new buildfarm script release?

Yeah. I'll push one in the next couple of days.

cheers

andrew
