Re: new tests post-feature freeze (was pgsql: Add TAP tests for pg_dump)

Started by Robert Haasover 9 years ago7 messages
#1Robert Haas
robertmhaas@gmail.com

On Sat, May 7, 2016 at 5:44 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:

On 5/7/16 9:36 AM, Stephen Frost wrote:

Honestly, over the next couple of months between feature-freeze and
release, I'd like to add even more tests, and not just to pg_dump but
also to other commands that don't have very good testing today (psql, in
particular, but pg_dumpall needs more also, and there's more to do with
pg_dump too).

If we're going to do that, there will be no stopping it. I also
have a bunch of code in this area lined up, but I was hoping to
submit it to the commit fest process at the right time and order to
get feedback and testing. If we're going to allow new tests being
developed right up until release, I can just stop doing release
preparation work right now and get back to coding and reviewing.

I do think that now is a good time for people to be reviewing what's
been committed, which includes writing tests (either automated ones,
which can be included in our test suite, or one-off's for testing
specific things but which don't make sense to include).

That doesn't mean we should be codeing or reviewing new features for
commit.

As for release prep, I certainly applaud everyone involved in that
effort as we do have the beta release and back-branch releases coming
up.

Regarding when we should stop adding new tests, I can't agree with the
notion that they should be treated as new features. New tests may break
the buildfarm, but they do not impact the stability of committed code,
nor do they introduce new bugs into the code which has been committed
(if anything, they expose committed bugs, as the set of tests we're
discussing did, which should then be fixed).

Ultimately, tests are code and should be treated as such.

Perhaps in some manners this is accurate, but I'd view our test suite as
an independent code base from PG. Bugs in the test suite might cause
false test failures or other issues on the buildfarm or during
packaging, but bugs or issues in the test suite won't cause data loss,
data corruption, or generally impact running operations of our users.

I can see some sense in holding off on adding more tests once we hit RC,
as we want anything done with RC to be essentially identical to the
release, unless there is a serious issue, but holding off on adding new
tests which could expose issues in committed code for the months during
beta doesn't seem sensible.

As such, I'd certainly encourage you to propose new tests, even now, but
not changes to the PG code, except for bug fixes, or changes agreed to
by the RMT. How you prioritise that with the release preparation work
is up to you, of course, though if I were in your shoes, I'd take care
of the release prep first, as we're quite close to doing a set of
releases. I'm happy to try and help with that effort myself, though
I've not been involved very much in release prep and am not entirely
sure what I can do to help.

In the end, the question of the degree to which tests constitute
features is one that needs to be made by the whole community, not
individual developers. I think you both raise good points. On the one
hand, developing new test frameworks could distract from other release
preparation tasks, as Peter rightly points out. On the other hand, it
could also make the release more reliable, as Stephen points out. I
believe that there is a clear consensus that at least some new tests
are welcome even post-feature freeze. However, I also believe that we
could get carried away with that, and end up having it become a
distraction from the task of getting the release out the door.

Also, if we say that new tests are not features, that would mean that
they could be back-patched even after the release is out the door, and
generally I'm not in favor of that policy, except when we're adding a
test to a back-branch that is closely tied to a bug we are fixing in
that branch. I do not want to see the pg_dump test suite back-patched
to all active branches, for example, even though I approve of having
test coverage for pg_dump. I am confident that minimizing churn in
the back-branches is a more important goal than ensuring test coverage
for those branches, and that we will regret it if we reverse those
priorities.

My suggestion is that, from this point forward, we add new tests to
9.6 only if they are closely related to a bug that is getting fixed or
a feature that is new in 9.6. I think that's a reasonable compromise,
but what do others think?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#1)

Robert Haas <robertmhaas@gmail.com> writes:

My suggestion is that, from this point forward, we add new tests to
9.6 only if they are closely related to a bug that is getting fixed or
a feature that is new in 9.6. I think that's a reasonable compromise,
but what do others think?

Yeah, that's fair. I suspect that Peter's unhappiness with these tests
is partly rooted in the fact that they're just generic pg_dump tests
and not connected to any new-in-9.6 behavior. As such, it's not entirely
clear why we should be taking any stability risk by pushing them in
so late. We seem to have gotten away with it this time --- but it would
only have taken one more problem to get me to switch my vote to "revert
them".

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#1)

* Robert Haas (robertmhaas@gmail.com) wrote:

Also, if we say that new tests are not features, that would mean that
they could be back-patched even after the release is out the door, and
generally I'm not in favor of that policy, except when we're adding a
test to a back-branch that is closely tied to a bug we are fixing in
that branch. I do not want to see the pg_dump test suite back-patched
to all active branches, for example, even though I approve of having
test coverage for pg_dump. I am confident that minimizing churn in
the back-branches is a more important goal than ensuring test coverage
for those branches, and that we will regret it if we reverse those
priorities.

I agree that it doesn't make sense to back-patch large test suites which
are primairly intended to provide code-coverage testing. In many cases,
that would simply be duplicative without much gain as the code hasn't
changed and tests would have to be removed due to features which don't
exist in older branches, and there is certainly a risk there that it
could complicate things for organizations which run the test suites and
for packagers unnecessairly.

My suggestion is that, from this point forward, we add new tests to
9.6 only if they are closely related to a bug that is getting fixed or
a feature that is new in 9.6. I think that's a reasonable compromise,
but what do others think?

I'm willing to accept that compromise, but I'm not thrilled with it due
to what it will mean for the process I'm currently going through. The
approach I've been using has been to add tests to gain more code
coverage of the code in pg_dump. That has turned up multiple
pre-existing bugs in pg_dump but the vast majority of the tests come
back clean. This compromise would mean that I'd continue to work
through the code coverage tests, but would have to segregate out and
commit only those tests which actually reveal bugs, once those bugs have
been fixed (as to avoid turning the buildfarm red). The rest of the
tests would still get written, but since they currently don't reveal
bugs, they would be shelved until development is opened for 9.7.

Thinking further on this, I'd probably end up creating a buildfarm
animal which only reports to me which runs the full set of tests, so
that any regression which occurs that the tests catch during the beta
period is caught.

Unfortunately, I'm not able to only work on and write tests only for
bugs, as I don't know what the bugs are. Writing new tests using the
test suite isn't really any more work than doing one-off testing, but
it has lasting value that one-off testing doesn't. I don't mean to say
"I accept the compromise but will do my own thing anyway" but rather to
point out that this is an efficient and worthwhile way to find bugs and
that's what I'm planning to work on to help ensure that we're ready to
release.

I'm happy to look at reviewing other committed patches also, but I've
had my head in pg_dump for the past few months and have a pretty good
handle on it, for the moment anyway, and there have certainly been lots
of complaints over the years about our lack of test coverage for it.
Were I to look at reviewing and testing other patches, well, at the
moment I'd be pretty inclined to write test cases for those too, as it
seems to be working to find issues.

Thanks!

Stephen

#4Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Stephen Frost (#3)
Re: Re: new tests post-feature freeze (was pgsql: Add TAP tests for pg_dump)

On 5/8/16 11:29 AM, Stephen Frost wrote:

My suggestion is that, from this point forward, we add new tests to

9.6 only if they are closely related to a bug that is getting fixed or
a feature that is new in 9.6. I think that's a reasonable compromise,
but what do others think?

I'm willing to accept that compromise, but I'm not thrilled with it due
to what it will mean for the process I'm currently going through. The
approach I've been using has been to add tests to gain more code
coverage of the code in pg_dump. That has turned up multiple
pre-existing bugs in pg_dump but the vast majority of the tests come
back clean. This compromise would mean that I'd continue to work
through the code coverage tests, but would have to segregate out and
commit only those tests which actually reveal bugs, once those bugs have
been fixed (as to avoid turning the buildfarm red). The rest of the
tests would still get written, but since they currently don't reveal
bugs, they would be shelved until development is opened for 9.7.

Having done extensive database unit testing in the past, I've
experienced what Stephen's talking about and agree it's a real pain.
With tap-style tests (or really anything that's not dependent on
something as fragile as diffing output), there's pretty low risk to
adding more tests that are passing.

As for tests distracting people from reviewing patches, robust tests
significantly reduce the need for manual review. I think it's a much
better approach to take a methodical approach of writing tests to help
verify a feature works than just randomly banging on it.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Noah Misch
noah@leadboat.com
In reply to: Stephen Frost (#3)

On Sun, May 08, 2016 at 12:29:27PM -0400, Stephen Frost wrote:

* Robert Haas (robertmhaas@gmail.com) wrote:

My suggestion is that, from this point forward, we add new tests to
9.6 only if they are closely related to a bug that is getting fixed or
a feature that is new in 9.6. I think that's a reasonable compromise,
but what do others think?

+1. This is a natural extension of the well-established default that we
(back-)patch tests for a bug into all releases getting a fix for the bug.

I'm willing to accept that compromise, but I'm not thrilled with it due
to what it will mean for the process I'm currently going through. The
approach I've been using has been to add tests to gain more code
coverage of the code in pg_dump. That has turned up multiple
pre-existing bugs in pg_dump but the vast majority of the tests come
back clean. This compromise would mean that I'd continue to work
through the code coverage tests, but would have to segregate out and
commit only those tests which actually reveal bugs, once those bugs have
been fixed (as to avoid turning the buildfarm red). The rest of the
tests would still get written, but since they currently don't reveal
bugs, they would be shelved until development is opened for 9.7.

Some or even most of the other tests would qualify under "closely related to
... a feature that is new in 9.6". Your 9.6 pg_dump changes affected object
selection and catalog extraction for most object types, so I think validating
those paths is in scope under Robert's suggestion. Testing "pg_dump
--encoding" or "pg_dump --jobs" probably wouldn't fall in scope, because those
features operate at arm's length from the 9.6 pg_dump changes. Expanding, for
example, tests of postgres_fdw query deparse would certainly fall out of
scope. That would have no apparent chance of catching a regression caused by
the 9.6 pg_dump changes.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#5)

On Sun, May 22, 2016 at 11:22 PM, Noah Misch <noah@leadboat.com> wrote:

Some or even most of the other tests would qualify under "closely related to
... a feature that is new in 9.6". Your 9.6 pg_dump changes affected object
selection and catalog extraction for most object types, so I think validating
those paths is in scope under Robert's suggestion. Testing "pg_dump
--encoding" or "pg_dump --jobs" probably wouldn't fall in scope, because those
features operate at arm's length from the 9.6 pg_dump changes. Expanding, for
example, tests of postgres_fdw query deparse would certainly fall out of
scope. That would have no apparent chance of catching a regression caused by
the 9.6 pg_dump changes.

...although it might catch bugs in the deparsing logic, which was
heavily revised in 9.6.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#6)

On Tue, May 24, 2016 at 08:19:20PM -0400, Robert Haas wrote:

On Sun, May 22, 2016 at 11:22 PM, Noah Misch <noah@leadboat.com> wrote:

Some or even most of the other tests would qualify under "closely related to
... a feature that is new in 9.6". Your 9.6 pg_dump changes affected object
selection and catalog extraction for most object types, so I think validating
those paths is in scope under Robert's suggestion. Testing "pg_dump
--encoding" or "pg_dump --jobs" probably wouldn't fall in scope, because those
features operate at arm's length from the 9.6 pg_dump changes. Expanding, for
example, tests of postgres_fdw query deparse would certainly fall out of
scope. That would have no apparent chance of catching a regression caused by
the 9.6 pg_dump changes.

...although it might catch bugs in the deparsing logic, which was
heavily revised in 9.6.

True. I cancel my last two sentences above; that was a weak choice of
example, and the surviving sentences convey the message adequately.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers