Time to change pg_regress diffs to unified by default?

Started by Andres Freundabout 9 years ago23 messageshackers
Jump to latest
#1Andres Freund
andres@anarazel.de

Hi,

I personally, and I know of a bunch of other regular contributors, find
context diffs very hard to read. Besides general dislike, for things
like regression test output context diffs are just not well suited.
E.g. in
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=prairiedog&dt=2017-04-06%2021%3A10%3A56&stg=check
the salient point (ERROR: 50 is outside the valid range for parameter "effective_io_concurrency" (0 .. 0))
is 130 lines into the diff, whereas it's right at the start in a unified diff.
Issues with one error that causes a lot of followup output changes are
really common in our regression suite.

I personally have PG_REGRESS_DIFF_OPTS set to -dU10, but that doesn't
help much analyzing buildfarm output.

Therefore I propose changing the defaults in pg_regress.c.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#1)
Re: Time to change pg_regress diffs to unified by default?

On Fri, Apr 7, 2017 at 10:31 AM, Andres Freund <andres@anarazel.de> wrote:

Hi,

I personally, and I know of a bunch of other regular contributors, find
context diffs very hard to read. Besides general dislike, for things
like regression test output context diffs are just not well suited.
E.g. in
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=prairiedog&amp;dt=2017-04-06%2021%3A10%3A56&amp;stg=check
the salient point (ERROR: 50 is outside the valid range for parameter "effective_io_concurrency" (0 .. 0))
is 130 lines into the diff, whereas it's right at the start in a unified diff.
Issues with one error that causes a lot of followup output changes are
really common in our regression suite.

I personally have PG_REGRESS_DIFF_OPTS set to -dU10, but that doesn't
help much analyzing buildfarm output.

Therefore I propose changing the defaults in pg_regress.c.

+1

So much easier to read.

--
Thomas Munro
http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#1)
Re: Time to change pg_regress diffs to unified by default?

On 04/06/2017 06:31 PM, Andres Freund wrote:

Hi,

I personally, and I know of a bunch of other regular contributors, find
context diffs very hard to read. Besides general dislike, for things
like regression test output context diffs are just not well suited.
E.g. in
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=prairiedog&amp;dt=2017-04-06%2021%3A10%3A56&amp;stg=check
the salient point (ERROR: 50 is outside the valid range for parameter "effective_io_concurrency" (0 .. 0))
is 130 lines into the diff, whereas it's right at the start in a unified diff.
Issues with one error that causes a lot of followup output changes are
really common in our regression suite.

I personally have PG_REGRESS_DIFF_OPTS set to -dU10, but that doesn't
help much analyzing buildfarm output.

Therefore I propose changing the defaults in pg_regress.c.

+1

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#1)
Re: Time to change pg_regress diffs to unified by default?

Andres Freund <andres@anarazel.de> writes:

I personally, and I know of a bunch of other regular contributors, find
context diffs very hard to read. Besides general dislike, for things
like regression test output context diffs are just not well suited.

Personally, I disagree completely. Unified diffs are utterly unreadable
for anything beyond trivial cases of small well-separated changes.

It's possible that regression failure diffs will usually fall into that
category, but I'm not convinced.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Peter Eisentraut
peter_e@gmx.net
In reply to: Andres Freund (#1)
Re: Time to change pg_regress diffs to unified by default?

On 4/6/17 18:31, Andres Freund wrote:

I personally have PG_REGRESS_DIFF_OPTS set to -dU10, but that doesn't
help much analyzing buildfarm output.

Therefore I propose changing the defaults in pg_regress.c.

I think one problem is that diff -u is not as portable as diff -c. For
example, the HP-UX 11 man page of diff doesn't list it.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#5)
Re: Time to change pg_regress diffs to unified by default?

On 04/06/2017 09:17 PM, Peter Eisentraut wrote:

On 4/6/17 18:31, Andres Freund wrote:

I personally have PG_REGRESS_DIFF_OPTS set to -dU10, but that doesn't
help much analyzing buildfarm output.

Therefore I propose changing the defaults in pg_regress.c.

I think one problem is that diff -u is not as portable as diff -c. For
example, the HP-UX 11 man page of diff doesn't list it.

Ugh. I suppose we could run a test to see if it was available. If it
comes to that, I guess I could do a similar test in the buildfarm
client. Seems a bit like overkill, though.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7David Rowley
dgrowleyml@gmail.com
In reply to: Andres Freund (#1)
Re: Time to change pg_regress diffs to unified by default?

On 7 April 2017 at 10:31, Andres Freund <andres@anarazel.de> wrote:

Hi,

I personally, and I know of a bunch of other regular contributors, find
context diffs very hard to read. Besides general dislike, for things
like regression test output context diffs are just not well suited.
E.g. in
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=prairiedog&amp;dt=2017-04-06%2021%3A10%3A56&amp;stg=check
the salient point (ERROR: 50 is outside the valid range for parameter "effective_io_concurrency" (0 .. 0))
is 130 lines into the diff, whereas it's right at the start in a unified diff.
Issues with one error that causes a lot of followup output changes are
really common in our regression suite.

I personally have PG_REGRESS_DIFF_OPTS set to -dU10, but that doesn't
help much analyzing buildfarm output.

Therefore I propose changing the defaults in pg_regress.c.

You mean people actually use those diffs? I've never done anything
apart from using <my favourite text comparison tool>. That way I can
reject and accept the changes as I wish, just by kicking the results
over to the expected results, or not, if there's a genuine mistake.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#4)
Re: Time to change pg_regress diffs to unified by default?

On Thu, Apr 06, 2017 at 07:01:32PM -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I personally, and I know of a bunch of other regular contributors, find
context diffs very hard to read. Besides general dislike, for things
like regression test output context diffs are just not well suited.

Personally, I disagree completely. Unified diffs are utterly unreadable
for anything beyond trivial cases of small well-separated changes.

It's possible that regression failure diffs will usually fall into that
category, but I'm not convinced.

For reading patches, I frequently use both formats. Overall, I perhaps read
unified 3/4 of the time and context 1/4 of the time.

For regression diffs, I use PG_REGRESS_DIFF_OPTS=-u and have never converted a
regression diff to context form. Hence, +1 for the proposed change.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#5)
Re: Time to change pg_regress diffs to unified by default?

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

I think one problem is that diff -u is not as portable as diff -c. For
example, the HP-UX 11 man page of diff doesn't list it.

FWIW, I can confirm that HPUX 10.20's diff hasn't got it. That would
not affect gaur/pademelon, if we make this change, because I installed
GNU diffutils on that machine a decade or two ago. It might be a bigger
issue for the other HPUX critters though.

Some other data points:

* POSIX 2008 requires diff -u.
* SUS v2 (POSIX 1997) does not.
* My other pet dinosaur, prairiedog (macOS 10.4 something), has it.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Christoph Berg
myon@debian.org
In reply to: Noah Misch (#8)
Re: [HACKERS] Time to change pg_regress diffs to unified by default?

Re: Noah Misch 2017-04-07 <20170407021431.GB2658646@tornado.leadboat.com>

I personally, and I know of a bunch of other regular contributors, find
context diffs very hard to read. Besides general dislike, for things
like regression test output context diffs are just not well suited.

Personally, I disagree completely. Unified diffs are utterly unreadable
for anything beyond trivial cases of small well-separated changes.

It's possible that regression failure diffs will usually fall into that
category, but I'm not convinced.

For reading patches, I frequently use both formats. Overall, I perhaps read
unified 3/4 of the time and context 1/4 of the time.

For regression diffs, I use PG_REGRESS_DIFF_OPTS=-u and have never converted a
regression diff to context form. Hence, +1 for the proposed change.

I've just had another case of horrible context diff from pg_regress.
I'd claim that regression diffs are particularly bad for context diffs
because one error will often trigger a whole chain of failures, which
will essentially make the whole file appear twice in the output,
whereas unified diffs would just put the original output and the error
right next to each other at the top. Having to scroll through a whole
.out file just to find "create extension; file not found" is very
inefficient.

It's nice that PG_REGRESS_DIFF_OPTS exists, but the diffs are often
coming from automated build logs where setting the variable after the
fact doesn't help.

Please consider the attached patch, extension packagers will thank
you.

Christoph

Attachments:

0001-Switch-pg_regress-to-output-unified-diffs-by-default.patchtext/x-diff; charset=us-asciiDownload+3-4
#11David Fetter
david@fetter.org
In reply to: Christoph Berg (#10)
Re: [HACKERS] Time to change pg_regress diffs to unified by default?

On Thu, Nov 22, 2018 at 02:10:59PM +0100, Christoph Berg wrote:

Re: Noah Misch 2017-04-07 <20170407021431.GB2658646@tornado.leadboat.com>

I personally, and I know of a bunch of other regular contributors, find
context diffs very hard to read. Besides general dislike, for things
like regression test output context diffs are just not well suited.

Personally, I disagree completely. Unified diffs are utterly unreadable
for anything beyond trivial cases of small well-separated changes.

It's possible that regression failure diffs will usually fall into that
category, but I'm not convinced.

For reading patches, I frequently use both formats. Overall, I perhaps read
unified 3/4 of the time and context 1/4 of the time.

For regression diffs, I use PG_REGRESS_DIFF_OPTS=-u and have never converted a
regression diff to context form. Hence, +1 for the proposed change.

I've just had another case of horrible context diff from pg_regress.
I'd claim that regression diffs are particularly bad for context diffs
because one error will often trigger a whole chain of failures, which
will essentially make the whole file appear twice in the output,
whereas unified diffs would just put the original output and the error
right next to each other at the top. Having to scroll through a whole
.out file just to find "create extension; file not found" is very
inefficient.

It's nice that PG_REGRESS_DIFF_OPTS exists, but the diffs are often
coming from automated build logs where setting the variable after the
fact doesn't help.

Please consider the attached patch, extension packagers will thank
you.

Christoph

+1 for pushing this. Regression diffs can get pretty noisy even with
it in place.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#12Peter Eisentraut
peter_e@gmx.net
In reply to: Christoph Berg (#10)
Re: [HACKERS] Time to change pg_regress diffs to unified by default?

On 22/11/2018 14:10, Christoph Berg wrote:

It's nice that PG_REGRESS_DIFF_OPTS exists, but the diffs are often
coming from automated build logs where setting the variable after the
fact doesn't help.

Please consider the attached patch, extension packagers will thank
you.

Committed.

While we're considering the pg_regress output, what do you think about
replacing the ======... separator with a standard diff separator like
"diff %s %s %s\n". This would make the file behave more like a proper
diff file, for use with other tools. And it shows the diff options
used, for clarity. See attached patch.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-Use-standard-diff-separator-for-regression.diffs.patchtext/plain; charset=UTF-8; name=0001-Use-standard-diff-separator-for-regression.diffs.patch; x-mac-creator=0; x-mac-type=0Download+9-7
#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#12)
Re: [HACKERS] Time to change pg_regress diffs to unified by default?

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

While we're considering the pg_regress output, what do you think about
replacing the ======... separator with a standard diff separator like
"diff %s %s %s\n". This would make the file behave more like a proper
diff file, for use with other tools. And it shows the diff options
used, for clarity. See attached patch.

I'm confused by this patch. Doesn't moving the diff call like that
break the logic completely?

regards, tom lane

#14Christoph Berg
myon@debian.org
In reply to: Peter Eisentraut (#12)
Re: [HACKERS] Time to change pg_regress diffs to unified by default?

Re: Peter Eisentraut 2019-01-02 <70440c81-37bb-76dd-e48b-b5a9550d5613@2ndquadrant.com>

Committed.

\o/

While we're considering the pg_regress output, what do you think about
replacing the ======... separator with a standard diff separator like
"diff %s %s %s\n". This would make the file behave more like a proper
diff file, for use with other tools. And it shows the diff options
used, for clarity. See attached patch.

It will especially say which _alternate.out file was used, which seems
like a big win. So +1.

Christoph

#15Daniel Gustafsson
daniel@yesql.se
In reply to: Christoph Berg (#14)
Re: [HACKERS] Time to change pg_regress diffs to unified by default?

On 3 Jan 2019, at 10:39, Christoph Berg <myon@debian.org> wrote:

Re: Peter Eisentraut 2019-01-02 <70440c81-37bb-76dd-e48b-b5a9550d5613@2ndquadrant.com>

While we're considering the pg_regress output, what do you think about
replacing the ======... separator with a standard diff separator like
"diff %s %s %s\n". This would make the file behave more like a proper
diff file, for use with other tools. And it shows the diff options
used, for clarity. See attached patch.

It will especially say which _alternate.out file was used, which seems
like a big win. So +1.

That has bitten more times than I’d like to admit, so definately +1 on being
explicit about that.

cheers ./daniel

#16Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#13)
Re: [HACKERS] Time to change pg_regress diffs to unified by default?

On 02/01/2019 21:44, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

While we're considering the pg_regress output, what do you think about
replacing the ======... separator with a standard diff separator like
"diff %s %s %s\n". This would make the file behave more like a proper
diff file, for use with other tools. And it shows the diff options
used, for clarity. See attached patch.

I'm confused by this patch. Doesn't moving the diff call like that
break the logic completely?

For clarification, I have attached a "before" and "after".

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

regression.diffs.newtext/plain; charset=UTF-8; name=regression.diffs.new; x-mac-creator=0; x-mac-type=0Download+2-0
regression.diffs.oldtext/plain; charset=UTF-8; name=regression.diffs.old; x-mac-creator=0; x-mac-type=0Download+2-0
#17Peter Eisentraut
peter_e@gmx.net
In reply to: Christoph Berg (#14)
Re: [HACKERS] Time to change pg_regress diffs to unified by default?

On 03/01/2019 10:39, Christoph Berg wrote:

It will especially say which _alternate.out file was used, which seems
like a big win. So +1.

It already shows that in the existing diff output header.

Although if you have a really long absolute path, it might be hard to
find it. So perhaps the attached patch to make it more readable.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-pg_regress-Don-t-use-absolute-paths-for-the-diff.patchtext/plain; charset=UTF-8; name=0001-pg_regress-Don-t-use-absolute-paths-for-the-diff.patch; x-mac-creator=0; x-mac-type=0Download+5-10
#18Peter Eisentraut
peter_e@gmx.net
In reply to: Daniel Gustafsson (#15)
Re: [HACKERS] Time to change pg_regress diffs to unified by default?

On 2019-01-03 11:20, Daniel Gustafsson wrote:

On 3 Jan 2019, at 10:39, Christoph Berg <myon@debian.org> wrote:

Re: Peter Eisentraut 2019-01-02 <70440c81-37bb-76dd-e48b-b5a9550d5613@2ndquadrant.com>

While we're considering the pg_regress output, what do you think about
replacing the ======... separator with a standard diff separator like
"diff %s %s %s\n". This would make the file behave more like a proper
diff file, for use with other tools. And it shows the diff options
used, for clarity. See attached patch.

It will especially say which _alternate.out file was used, which seems
like a big win. So +1.

That has bitten more times than I’d like to admit, so definately +1 on being
explicit about that.

committed

(This might be one of those rare times where one hopes for a buildfarm
failure for verification. :-) )

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#19Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Peter Eisentraut (#18)
Re: [HACKERS] Time to change pg_regress diffs to unified by default?

"Peter" == Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

Peter> (This might be one of those rare times where one hopes for a
Peter> buildfarm failure for verification. :-) )

Also while we're tweaking regression test output, would it be possible
to have some indicator of whether a test passed because a variant file
in the resultmap was ignored in favor of the standard result?

The current system of silently ignoring the resultmap means that nobody
ever notices when resultmap entries become obsolete....

--
Andrew (irc:RhodiumToad)

#20Christoph Berg
myon@debian.org
In reply to: Andrew Gierth (#19)
Re: [HACKERS] Time to change pg_regress diffs to unified by default?

Re: Andrew Gierth 2019-02-15 <874l95m8w7.fsf@news-spur.riddles.org.uk>

Also while we're tweaking regression test output, would it be possible
to have some indicator of whether a test passed because a variant file
in the resultmap was ignored in favor of the standard result?

The current system of silently ignoring the resultmap means that nobody
ever notices when resultmap entries become obsolete....

By the same argument, it should always print which variant file was
used so determining which _N.out files are still in use is possible.

Christoph

#21Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#17)
#22Peter Eisentraut
peter_e@gmx.net
In reply to: Christoph Berg (#20)
#23Christoph Berg
myon@debian.org
In reply to: Peter Eisentraut (#22)