Refactoring the regression tests for more independence

Started by Tom Laneover 4 years ago10 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

Noah suggested in [1]/messages/by-id/20211217182518.GA2529654@rfd.leadboat.com that we should make an effort to allow any one
of the core regression tests to be run mostly standalone (i.e., after
running only the test_setup script), so as to allow quicker iterations
when adjusting a script. This'd presumably also lead to the tests
being more independent, which seems like a good thing. I spent a
bit of time looking into this idea, and attached are a couple of
draft patches for discussion.

I soon realized that complete independence was probably infeasible,
and not very useful anyway. Notably, it doesn't seem useful to get
rid of the geometry script's dependencies on the per-geometric-type
scripts, nor of horology's dependencies on the per-datetime-type
scripts. I suppose we could think of just merging the per-type
scripts into geometry and horology, but that does not seem like an
improvement. So my goal here is to get rid of *most* dependencies,
and ensure that the remainder are documented in the parallel_schedule
file. Also note that I'm explicitly not promising that the tests
can now run in any order --- I've made no attempt to get rid of
"A can't run before B" or "A can't run concurrently with B"
restrictions.

0001 below gets rid of dependencies on the create_function_N scripts,
by moving functions they define into either the particular script
that uses the function (for the ones referenced in only one script,
which is most) or into the test_setup script. It turns out that
create_function_1 and create_function_2 go away entirely, because
nothing's left. While I've not done so here, I'm tempted to rename
create_function_0 to create_function_c and create_function_3 to
create_function_sql, to give them better-defined charters and
eliminate the confusion with trailing digits for variant files.
(With that division of labor in mind, 0001 does move a couple of
SQL functions from create_function_0 to create_function_3.)

0001 also moves some hash functions that were created in insert.sql
into test_setup, because they were also used elsewhere. I also
cleaned up some other type-related script interdependencies, by
consolidating the "widget"-related code into create_type, removing
a dependency on the custom path ## path operator in favor of the
equivalent built-in ?# operator, and declaring the textrange and
float8range types in test_setup. Lastly, 0001 fixes the
tab_core_types test case in type_sanity so that it only covers
built-in types, not types that randomly happen to be created in
test scripts that run before type_sanity.

0002 performs a similar set of transformations to get rid of
table-related script interdependencies. I identified a dozen or so
tables that are used in multiple scripts and (for the most part)
are not modified once filled. I moved the creation and filling of
those into test_setup. There were also some tables that were really
only used in one script, so I could move their creation and filling to
that script, leaving no cross-script dependencies on create_table.sql
or copy.sql. I made some other adjustments to get rid of incidental
cross-script dependencies. There are a lot more judgment calls in
0002 than 0001, though, so people might have objections or better
ideas. Notably:

* A few scripts insisted on modifying the "shared" tables, which
seemed like something to get rid of. What I did, to minimize the
diffs in these scripts, was to make them create temporary tables
of the same names and then scribble on the temp tables. There's
an argument to be made that this will be too confusing and we'd be
better off changing the scripts to use different names for these
local tables. That'd make the patch even bulkier, though.

* create_index made some indexes on circle_tbl and polygon_tbl,
which I didn't want to treat as shared tables. I moved those
indexes and the associated test queries to the end of geometry.sql.
They could have been made in circle.sql and polygon.sql,
but I was worried that that would possibly change plans for
existing queries in geometry.sql.

* create_index also had some queries on array_op_test, which
I'm now treating as private to arrays.sql. The purpose of
those was to compare index-free results to indexable queries
on array_index_op_test, which is now private to create_index.
So what I did was to replace those by doing the same queries
on array_index_op_test before building its indexes. This is
a better way anyway since it doesn't require the unstated
assumption that array_op_test and array_index_op_test
contain identical data.

* The situation with a_star and its child tables was a bit of a mess.
They were created in create_table.sql, populated in create_misc.sql,
then misc.sql did significant DDL on them, and finally select_parallel
used them in queries (and would fail outright if the DDL changes
hadn't been made). What I've done here is to move the create_table
and misc steps into create_misc, and then allow select_parallel to
depend on create_misc. You could argue for chopping that up
differently, perhaps, but I'm not seeing alternatives I like better.

* Having established the precedent that I'd allow some cross-script
dependencies on create_misc, I adjusted a couple of places that
were depending on the "b" table made by inherit.sql to depend on
create_misc's b_star, which has just about the same schema including
children. I figured multiple dependencies on create_misc was better
than some on create_misc and some on inherit. (So maybe there's
a case for moving that entire sequence into test_setup? But it
seems like a big hunk that doesn't belong there.)

* Another table with an unreasonably large footprint was the "tmp"
table made (but not used) in select.sql, used in select_distinct and
select_distinct_on, and then modified and eventually dropped in
misc.sql. It's just luck this doesn't collide with usages of
tables named "tmp" in some other scripts. Since "tmp" is just a
copy of some columns from "onek", I adjusted select_distinct and
select_distinct_on to select from "onek" instead, and then
consolidated the usage of the table into misc.sql. (I'm half
tempted to drop the table and test cases from misc altogether.
The comments there indicate that this is a 25-year-old test for
some b-tree problem or other --- but tmp has no indexes, so it
can't any longer be testing what it was intended to. But removing
test cases is not in the charter of this patch series, I guess.)

* expressions.sql had some BETWEEN tests depending on date_tbl,
which I resolved by moving those tests to horology.sql. We could
alternatively change them to use some other table/datatype, or
just accept the extra dependency.

* The rules and sanity_check scripts are problematic because
their results depend heavily on just which scripts execute
before them. In this patch I've adopted a big hammer:
I trimmed rules' output by restricting it to only print
info about pg_catalog relations, and I dropped the troublesome
sanity_check query altogether. I don't think that sanity_check
query has any real use, certainly not enough to justify the
maintenance effort we've put into it over the years. Maybe
there's an objection to restricting the coverage of rules,
though. (One idea to exercise ruleutils.c more is to let that
query cover information_schema as well as pg_catalog. Local
code-coverage testing says there's not much difference, though.)

Some things I'm not totally happy about:

* Testing shows that quite a few scripts have dependencies on
create_index, because their EXPLAIN output or row output order
varies if the indexes aren't there. This dependency could
likely be removed by moving creation of some of the indexes on
the "shared" tables into test_setup, but I'm unconvinced whether
that's a good thing to do or not. I can live with documenting
create_index as a common dependency.

* I treated point_tbl as a shared table, but I'm not sure that's a
great idea, especially since the non-geometry consumers of point_tbl
both want to scribble on it. Doing something else would be more
invasive though.

* psql.sql has a dependency on create_am, because the "heap2" access
method that that creates shows up in psql's output. This seems fairly
annoying, since there's no good semantic excuse for such coupling.
One quick-and-dirty workaround could be to run the psql test before
create_am.

* amutils depends on indexes from all over the map, so it
has a rather horrid dependency list. Perhaps we should change
it to print info about indexes it manufactures locally.

Thoughts?

regards, tom lane

PS: To save anyone else the work of reinventing it, I attach
a script I used to confirm that the modified test scripts have
no unexpected dependencies. I don't propose to commit this,
especially not in its current hacky state of overwriting the
parallel_schedule file. (Maybe we should provide a way to
run specified test script(s) *without* invoking the whole
schedule first.)

[1]: /messages/by-id/20211217182518.GA2529654@rfd.leadboat.com

Attachments:

0001-refactor-functions.patchtext/x-diff; charset=us-ascii; name=0001-refactor-functions.patchDownload+619-464
0002-refactor-tables.patchtext/x-diff; charset=us-ascii; name=0002-refactor-tables.patchDownload+1767-2035
script-dep-testing.sh.gzapplication/x-gzip; name=script-dep-testing.sh.gzDownload
#2Justin Pryzby
pryzby@telsasoft.com
In reply to: Tom Lane (#1)
Re: Refactoring the regression tests for more independence

On Fri, Dec 24, 2021 at 05:00:17PM -0500, Tom Lane wrote:

While I've not done so here, I'm tempted to rename
create_function_0 to create_function_c and create_function_3 to
create_function_sql, to give them better-defined charters and
eliminate the confusion with trailing digits for variant files.

+1

(Maybe we should provide a way to run specified test script(s) *without*
invoking the whole schedule first.)

+1 ; it can be done later, though.

It's nice to be able to get feedback within a few seconds. That supports the
idea of writing tests earlier.

I guess this may expose some instabilities due to timing of autovacuum (which
I'd say is a good thing).

If you rearrange the creation of objects, that may provide an opportunity to
rename some tables with too-short names, since backpatching would already have
conflicts.

--
Justin

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin Pryzby (#2)
Re: Refactoring the regression tests for more independence

Not too surprisingly, these patches broke during the commitfest.
Here's a rebased version.

I'm not sure that anyone wants to review these in detail ...
should I just go ahead and push them?

regards, tom lane

Attachments:

0001-refactor-functions-2.patchtext/x-diff; charset=us-ascii; name=0001-refactor-functions-2.patchDownload+628-473
0002-refactor-tables-2.patchtext/x-diff; charset=us-ascii; name=0002-refactor-tables-2.patchDownload+1767-2035
#4Julien Rouhaud
rjuju123@gmail.com
In reply to: Tom Lane (#3)
Re: Refactoring the regression tests for more independence

On Mon, Feb 07, 2022 at 02:00:25PM -0500, Tom Lane wrote:

Not too surprisingly, these patches broke during the commitfest.
Here's a rebased version.

I'm not sure that anyone wants to review these in detail ...
should I just go ahead and push them?

I don't see anything shocking after a quick glance, and I don't think any
review is going to give any more confidence compared to the script-dep-testing
script, so +1 for pushing them since the cf bot is green again.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Julien Rouhaud (#4)
Re: Refactoring the regression tests for more independence

Julien Rouhaud <rjuju123@gmail.com> writes:

On Mon, Feb 07, 2022 at 02:00:25PM -0500, Tom Lane wrote:

Not too surprisingly, these patches broke during the commitfest.
Here's a rebased version.
I'm not sure that anyone wants to review these in detail ...
should I just go ahead and push them?

I don't see anything shocking after a quick glance, and I don't think any
review is going to give any more confidence compared to the script-dep-testing
script, so +1 for pushing them since the cf bot is green again.

Done, will watch the farm.

regards, tom lane

#6Aleksander Alekseev
aleksander@timescale.com
In reply to: Tom Lane (#5)
Re: Refactoring the regression tests for more independence

Hi hackers,

I don't see anything shocking after a quick glance, and I don't think any
review is going to give any more confidence compared to the script-dep-testing
script, so +1 for pushing them since the cf bot is green again.

Done, will watch the farm.

I wanted to test one of the patches we have for the July CF on the
Raspberry Pi 3 Model B+. It runs Raspbian GNU/Linux 10 (buster) and
Linux Kernel 5.10.60-v7+.

I discovered that the PostgreSQL tests don't pass in this environment.
Using `git bisect` I was able to pinpoint the problem to cc50080a and
found this thread. Currently `REL_15_STABLE` and `master` are
affected.

To build PostgreSQL I use my regular set of scripts [1]https://github.com/afiskon/pgscripts/ and the
following command:

```
./quick-build.sh && ./single-install.sh && make installcheck
``

regression.diffs and regression.out are attached. The same tests pass
just fine on Ubuntu Linux and MacOS.

I didn't investigate the problem further since it's pretty late in my
timezone. I just wanted to share my current discoveries with you.
Since we have several agents on buildfarm running Raspbian that are
pretty happy with the patch I would guess this may have something to
do with particular flags and/or configure options I'm using.

[1]: https://github.com/afiskon/pgscripts/

--
Best regards,
Aleksander Alekseev

Attachments:

regression.tgzapplication/x-gzip; name=regression.tgzDownload+1-0
#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Aleksander Alekseev (#6)
Re: Refactoring the regression tests for more independence

Aleksander Alekseev <aleksander@timescale.com> writes:

I wanted to test one of the patches we have for the July CF on the
Raspberry Pi 3 Model B+. It runs Raspbian GNU/Linux 10 (buster) and
Linux Kernel 5.10.60-v7+.
I discovered that the PostgreSQL tests don't pass in this environment.

Since you haven't explained what's different about this environment,
it's hard to comment on these results. But is this really a stock
Postgres source tree, with no local modifications? The fragment of
src/test/regress/expected/copy.out that you show does not look
current.

regards, tom lane

#8Aleksander Alekseev
aleksander@timescale.com
In reply to: Tom Lane (#7)
Re: Refactoring the regression tests for more independence

Hi Tom,

Since you haven't explained what's different about this environment,
it's hard to comment on these results. But is this really a stock
Postgres source tree, with no local modifications? The fragment of
src/test/regress/expected/copy.out that you show does not look
current.

Yes, this is a stock PostgreSQL source code without any modification,
with `git clean -dfx` etc.

The fragment of copy.out probably doesn't look current because I was
using `git bisect` and I'm on cc50080a82 right now. However the same
tests fail on both `master` and `REL_15_STABLE`. It takes a while on
Raspberry Pi to rebuild Postgres :)

To clarify, the step that is failing is `./quick-build.sh`, or `make
check' in this script to be precise. So postgresql.conf I'm using in
single-install.sh has nothing to do with the problem, this step is not
reached.

Sorry about the confusion regarding the environment differences. GCC
version is 8.3.0, Perl 5.28.1. All in all this is pretty much the
default Raspbian 10 environment, something you would typically get
after setting up your RPi 3 B+ using Raspberry Pi Imager and running
`apt update`, nothing exotic. Please let me know if there are any
other details of interest.

I'll continue looking for the source of the problem and will post an
update as soon as I have one.

--
Best regards,
Aleksander Alekseev

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Aleksander Alekseev (#8)
Re: Refactoring the regression tests for more independence

Aleksander Alekseev <aleksander@timescale.com> writes:

Sorry about the confusion regarding the environment differences. GCC
version is 8.3.0, Perl 5.28.1. All in all this is pretty much the
default Raspbian 10 environment, something you would typically get
after setting up your RPi 3 B+ using Raspberry Pi Imager and running
`apt update`, nothing exotic. Please let me know if there are any
other details of interest.

FWIW, I tried to replicate this locally on my own RPi3B+, using
current Ubuntu 20.04.4 LTS (GNU/Linux 5.4.0-1066-raspi aarch64).
No luck: it all works fine for me. We have at least one Raspbian
buildfarm animal too, and it's not been unhappy either. I suspect
there is something odd about your environment settings.

regards, tom lane

#10Aleksander Alekseev
aleksander@timescale.com
In reply to: Tom Lane (#9)
Re: Refactoring the regression tests for more independence

Hi Tom,

FWIW, I tried to replicate this locally on my own RPi3B+, using
current Ubuntu 20.04.4 LTS (GNU/Linux 5.4.0-1066-raspi aarch64).
No luck: it all works fine for me. We have at least one Raspbian
buildfarm animal too, and it's not been unhappy either. I suspect
there is something odd about your environment settings.

Thanks for sharing this.

I repeated the experiment in a clean environment (Raspbian installed
from scratch on a brand new SD-card) and can confirm that the problem
is gone.

Sorry for the disturbance.

--
Best regards,
Aleksander Alekseev