Reducing the runtime of the core regression tests

Started by Tom Lane · about 7 years ago · 22 messages · hackers
#1 Tom Lane
tgl@sss.pgh.pa.us

Back in [1]/messages/by-id/16646.1549770618@sss.pgh.pa.us I wrote

I've wondered for some time whether we couldn't make a useful
reduction in the run time of the PG regression tests by looking
for scripts that run significantly longer than others in their
parallel groups, and making an effort to trim the runtimes of
those particular scripts.

I finally got some time to pursue that, and attached is a proposed patch
that moves some tests around and slightly adjusts some other ones.
To cut to the chase: on my workstation, this cuts the time for
"make installcheck-parallel" from 21.9 sec to 13.9 sec, or almost 40%.
I think that's a worthwhile improvement, considering how often all of us
run those tests.

Said workstation is an 8-core machine, so an objection could be made
that maybe I'm optimizing too much for multicore. But even laptops
have multiple cores these days. To check ostensibly-worse cases,
I also tried this patch on dromedary's host (old dual-core Intel), and
found that installcheck-parallel went from about 92 seconds to about 82.
On gaur's host (single-core HPPA), the time went from 840 sec to 774.
So there's close to 10% savings even on very lame machines.

In no particular order, here's what I did:

* Move the strings and numerology tests to be part of the second
parallel test group; there is no reason to run them serially.
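(For reference, each "test:" line in src/test/regress/parallel_schedule names one parallel group; everything on a line is launched concurrently, so a group only finishes when its slowest script does. The grouping below is an abridged sketch for illustration, not the exact contents of the patch:)

```
# src/test/regress/parallel_schedule (abridged, hypothetical grouping)
# every script on one "test:" line runs concurrently
test: boolean char name varchar text int2 int4 int8 oid float4 float8 numeric
test: strings numerology point lseg line box path polygon circle date time
```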

* Move the insert and insert_conflict tests to be part of the "copy"
parallel group. There is no reason to run them serially, plus they
were obviously placed with the aid of a dartboard, or at least without
concern for fixing comments one line away.

* Move the select and errors tests into the preceding parallel group
instead of running them serially. (This required adjusting the
constraints test, which uses a table named "tmp" as select also does.
I fixed that by making it a temp table in the constraints test.)

* create_index.sql ran much longer than other tests in its parallel
group, so I split out the SP-GiST-related tests into a new file
create_index_spgist.sql, and moved the delete_test_table test case
to btree_index.sql.

* Likewise, join.sql needed to be split up, so I moved the "exercises
for the hash join code" portion into a new file join_hash.sql.

* Likewise, I split up indexing.sql by moving the "fastpath" test into
a new file index_fastpath.sql.

* psql and stats_ext both ran considerably longer than other tests
in their group. I fixed that by moving them into the next parallel
group, where the rules test has a similar runtime. (To make it
safe to run stats_ext in parallel with rules, I adjusted the latter
to only dump views/rules from the pg_catalog and public schemas,
which was what it was doing anyway. stats_ext makes some views in
a transient schema, which now will not affect rules.)

* The plpgsql test ran much longer than others, which turns out to be
largely due to the 2-second timeout in its test of statement_timeout.
In view of the experience reflected in commit f1e671a0b, just
reducing that timeout seems unsafe. What I did instead was to shove
that test case and some related ones into a new plpgsql test file,
src/pl/plpgsql/src/sql/plpgsql_trap.sql, so that it's not part of the
core regression tests at all. (We've talked before about moving
chunks of plpgsql.sql into the plpgsql module, so this is sort of a
down payment on that.) Now, if you think about the time to do
check-world rather than just the core regression tests, this isn't
obviously a win, and in fact it might be a loss because the plpgsql
tests run serially not in parallel with anything else. However,
by that same token, the parallel-testing overload we were concerned
about in f1e671a0b should be much less bad in the plpgsql context.
I therefore took a chance on reducing the timeout down to 1 second.
If the buildfarm doesn't like that, we can change it back to 2 seconds
again. It should still be a net win because of the fact that
check-world runs the core tests more than once.

* Another thing I changed in the SP-GiST tests was to adjust the tests
that are trying to verify that KNN indexscan gives the same ordering
as seqscan-and-sort. Those were using FULL JOIN to match up rank()
results, which is horribly inefficient on this data set, because there
are 1000 duplicate entries in quad_point_tbl and hence 1000 rows with
the same rank; we proceed to form 1000000 join rows that we then have
to filter away again. What I did about that was to replace rank()
with row_number() so that the primary join key is unique, shaving well
over a second off the test's runtime. There is a small problem, namely
that the data set has two points that are different but yet have exactly
the same distance to the origin, causing their sort ordering to be
underdetermined. I think however that it's okay to simplify the queries
so that they just verify that we get the same values and ordering of the
distance results. The purpose of this test is not to see whether <->
gets the right answer, it is to see whether SP-GiST can return results
in the correct order according to <->, so I think it's okay to compare
only the distances and not the underlying points.
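(To make the cardinality problem concrete, here is a scaled-down sketch using Python's sqlite3 rather than Postgres, with 100 duplicates rather than 1000; the table and query shapes are simplified stand-ins for the actual regression-test queries, and it needs SQLite >= 3.25 for window functions:)

```python
# Why joining on rank() blows up with duplicates while row_number() stays 1:1.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE pts (dist REAL)")
# 100 duplicate distances plus two distinct ones
con.executemany("INSERT INTO pts VALUES (?)", [(0.0,)] * 100 + [(1.0,), (2.0,)])

def join_size(winfunc):
    # join two windowed copies of the table on the window-function value,
    # the way the seqscan-vs-indexscan comparison queries match up rows
    q = f"""
      WITH w AS (SELECT {winfunc} OVER (ORDER BY dist) AS n, dist FROM pts)
      SELECT count(*) FROM w a JOIN w b USING (n)
    """
    return con.execute(q).fetchone()[0]

print(join_size("rank()"))        # 100*100 + 1 + 1 = 10002 join rows
print(join_size("row_number()"))  # 102 join rows, one per input row
```

With row_number() the join key is unique, so the join produces exactly one row per input row instead of a quadratic blowup on the duplicate group.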

* Also, in polygon.sql, I removed quad_poly_tbl_ord_seq1 and
quad_poly_tbl_ord_idx1; the related queries are very expensive and
it's not clear what coverage they provide that isn't provided by
the near-duplicate tests involving quad_poly_tbl_ord_seq2 and
quad_poly_tbl_ord_idx2. (Note: polygon.sql seems to run proportionally
much slower on some machines than others. Unpatched, on my workstation
it's 3x slower than timestamptz, whereas on say longfin it's a good bit
faster. It might be interesting to look into why that is. But anyway,
this part of the patch benefits machines where it's slower.)

There are still a few tests that seem like maybe it'd be worth trimming,
but I felt like I'd hit a point of diminishing returns, so I stopped
here.

Thoughts? Anyone object to making these sorts of changes
post-feature-freeze?

regards, tom lane

[1]: /messages/by-id/16646.1549770618@sss.pgh.pa.us

Attachments:

rearrange-regression-tests-for-more-parallelism.patch (text/x-diff, charset=us-ascii, +4156 -3695)
#2 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#1)
Re: Reducing the runtime of the core regression tests

Hi,

On 2019-04-10 18:35:15 -0400, Tom Lane wrote:

on my workstation, this cuts the time for "make installcheck-parallel"
from 21.9 sec to 13.9 sec, or almost 40%. I think that's a worthwhile
improvement, considering how often all of us run those tests.

Awesome.

* The plpgsql test ran much longer than others, which turns out to be
largely due to the 2-second timeout in its test of statement_timeout.
In view of the experience reflected in commit f1e671a0b, just
reducing that timeout seems unsafe. What I did instead was to shove
that test case and some related ones into a new plpgsql test file,
src/pl/plpgsql/src/sql/plpgsql_trap.sql, so that it's not part of the
core regression tests at all. (We've talked before about moving
chunks of plpgsql.sql into the plpgsql module, so this is sort of a
down payment on that.) Now, if you think about the time to do
check-world rather than just the core regression tests, this isn't
obviously a win, and in fact it might be a loss because the plpgsql
tests run serially not in parallel with anything else. However,
by that same token, the parallel-testing overload we were concerned
about in f1e671a0b should be much less bad in the plpgsql context.
I therefore took a chance on reducing the timeout down to 1 second.
If the buildfarm doesn't like that, we can change it back to 2 seconds
again. It should still be a net win because of the fact that
check-world runs the core tests more than once.

Hm, can't we "just" parallelize the plpgsql schedule instead?

Thoughts? Anyone object to making these sorts of changes
post-feature-freeze?

Hm. There's some advantage to doing so, because it won't break any large
pending changes. But it's also possible that it'll destabilize the
buildfarm some. In personal capacity I'm like +0.5.

Greetings,

Andres Freund

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#2)
Re: Reducing the runtime of the core regression tests

Andres Freund <andres@anarazel.de> writes:

On 2019-04-10 18:35:15 -0400, Tom Lane wrote:

... What I did instead was to shove
that test case and some related ones into a new plpgsql test file,
src/pl/plpgsql/src/sql/plpgsql_trap.sql, so that it's not part of the
core regression tests at all. (We've talked before about moving
chunks of plpgsql.sql into the plpgsql module, so this is sort of a
down payment on that.) Now, if you think about the time to do
check-world rather than just the core regression tests, this isn't
obviously a win, and in fact it might be a loss because the plpgsql
tests run serially not in parallel with anything else.

Hm, can't we "just" parallelize the plpgsql schedule instead?

If somebody wants to work on that, I won't stand in the way, but
it seems like material for a different patch.

Thoughts? Anyone object to making these sorts of changes
post-feature-freeze?

Hm. There's some advantage to doing so, because it won't break any large
pending changes. But it's also possible that it'll destabilize the
buildfarm some. In personal capacity I'm like +0.5.

My thought was that there is (hopefully) going to be a lot of testing
going on over the next few months, so making that faster would be
a useful activity.

regards, tom lane

#4 Peter Geoghegan
pg@bowt.ie
In reply to: Tom Lane (#1)
Re: Reducing the runtime of the core regression tests

On Wed, Apr 10, 2019 at 3:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I finally got some time to pursue that, and attached is a proposed patch
that moves some tests around and slightly adjusts some other ones.
To cut to the chase: on my workstation, this cuts the time for
"make installcheck-parallel" from 21.9 sec to 13.9 sec, or almost 40%.
I think that's a worthwhile improvement, considering how often all of us
run those tests.

Great!

* create_index.sql ran much longer than other tests in its parallel
group, so I split out the SP-GiST-related tests into a new file
create_index_spgist.sql, and moved the delete_test_table test case
to btree_index.sql.

Putting the delete_test_table test case in btree_index.sql makes perfect sense.

* Likewise, I split up indexing.sql by moving the "fastpath" test into
a new file index_fastpath.sql.

I just noticed that the "fastpath" test actually fails to test the
fastpath optimization -- the coverage we do have comes from another
test in btree_index.sql, that I wrote back in December. While I did
make a point of ensuring that we had test coverage for the nbtree
fastpath optimization that went into Postgres 11, I also didn't
consider the original fastpath test. I assumed that there were no
tests to begin with, because gcov showed me that there was no test
coverage back in December.

What happened here was that commit 074251db limited the fastpath to
only be applied to B-Trees with at least 3 levels. While the original
fastpath optimization tests actually tested the fastpath optimization
when it first went in, in March 2018, that only lasted a few weeks,
since 074251db didn't adjust the test to still be effective.

I'll come up with a patch to deal with this situation, by
consolidating the old and new tests in some way. I don't think that
your work needs to block on that, though.

Thoughts? Anyone object to making these sorts of changes
post-feature-freeze?

IMV there should be no problem with pushing ahead with this after
feature freeze.

--
Peter Geoghegan

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#4)
Re: Reducing the runtime of the core regression tests

Peter Geoghegan <pg@bowt.ie> writes:

On Wed, Apr 10, 2019 at 3:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

* Likewise, I split up indexing.sql by moving the "fastpath" test into
a new file index_fastpath.sql.

I just noticed that the "fastpath" test actually fails to test the
fastpath optimization -- the coverage we do have comes from another
test in btree_index.sql, that I wrote back in December.

Oh! Hmm.

I'll come up with a patch to deal with this situation, by
consolidating the old and new tests in some way. I don't think that
your work needs to block on that, though.

Should I leave out the part of my patch that creates index_fastpath.sql?
If we're going to end up removing that version of the test, there's no
point in churning the related lines beforehand.

One way or the other I want to get that test out of where it is,
because indexing.sql is currently the slowest test in its group.
But if you prefer to make btree_index run a bit longer rather than
inventing a new test script, that's no problem from where I stand.

regards, tom lane

#6 Peter Geoghegan
pg@bowt.ie
In reply to: Tom Lane (#5)
Re: Reducing the runtime of the core regression tests

On Wed, Apr 10, 2019 at 4:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'll come up with a patch to deal with this situation, by
consolidating the old and new tests in some way. I don't think that
your work needs to block on that, though.

Should I leave out the part of my patch that creates index_fastpath.sql?
If we're going to end up removing that version of the test, there's no
point in churning the related lines beforehand.

The suffix truncation stuff made it tricky to force a B-Tree to be
tall without also consisting of many blocks. Simply using large,
random key values in suffix attributes didn't work anymore. The
solution I came up with for the new fastpath test that made it into
btree_index.sql was to have redundancy in leading keys, while avoiding
TOAST compression by using plain storage in the table/index.

One way or the other I want to get that test out of where it is,
because indexing.sql is currently the slowest test in its group.
But if you prefer to make btree_index run a bit longer rather than
inventing a new test script, that's no problem from where I stand.

The original fastpath tests don't seem particularly effective to me,
even without the oversight I mentioned. I suggest that you remove
them, since the minimal btree_index.sql fast path test is sufficient.
If there ever was a problem in this area, then amcheck would certainly
detect it -- there is precisely one place for every tuple on v4
indexes. The original fastpath tests won't tickle the implementation
in any interesting way in my opinion.

--
Peter Geoghegan

#7 Peter Geoghegan
pg@bowt.ie
In reply to: Peter Geoghegan (#6)
Re: Reducing the runtime of the core regression tests

On Wed, Apr 10, 2019 at 4:56 PM Peter Geoghegan <pg@bowt.ie> wrote:

The original fastpath tests don't seem particularly effective to me,
even without the oversight I mentioned. I suggest that you remove
them, since the minimal btree_index.sql fast path test is sufficient.

To be clear: I propose that you remove the tests entirely, and we
leave it at that. I don't intend to follow up with my own patch
because I don't think that there is anything in the original test case
that is worth salvaging.

--
Peter Geoghegan

#8 David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#1)
Re: Reducing the runtime of the core regression tests

On Thu, 11 Apr 2019 at 10:35, Tom Lane <tgl@sss.pgh.pa.us> wrote:

In no particular order, here's what I did:

I was surprised to see nothing mentioned about attempting to roughly
sort the test order in each parallel group according to their runtime.
Shorter-running tests coming last should reduce the chances of one
process doing its last test when all other processes are already done
and sitting idle. Of course, this won't be consistent over all
hardware, but maybe it could be done as an average time for each test
over the entire buildfarm.
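(The heuristic can be illustrated with a toy worker-pool simulation. The runtimes and the queue-based launch model below are invented for illustration, and this only applies to a hypothetical scheduler that hands tests out from a queue rather than launching a whole group at once:)

```python
# Longest-first (LPT) ordering vs. an unlucky order, with a fixed worker pool
# where each worker pulls the next test off a shared list.
import heapq

def makespan(runtimes, workers=4):
    finish = [0.0] * workers          # per-worker finish times (a min-heap)
    heapq.heapify(finish)
    for t in runtimes:                # each test goes to the first free worker
        heapq.heappush(finish, heapq.heappop(finish) + t)
    return max(finish)

tests = [1] * 9 + [8, 7, 6]                     # long tests queued last
print(makespan(tests))                          # 10: one straggler runs alone
print(makespan(sorted(tests, reverse=True)))    # 8: short tests fill the tail
```

Longest-first ordering (the classic longest-processing-time heuristic) cuts the straggler tail here from 10 to 8 time units.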

Thoughts? Anyone object to making these sorts of changes
post-feature-freeze?

I think it's a good time to do this sort of thing. It should be
easier to differentiate tests destabilising due to this change from
the noise of other changes that are going in... since currently,
the rate of those other changes should not be very high. Doing it any
later in the freeze does not seem better since we might discover some
things that need to be fixed due to this.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#8)
Re: Reducing the runtime of the core regression tests

David Rowley <david.rowley@2ndquadrant.com> writes:

I was surprised to see nothing mentioned about attempting to roughly
sort the test order in each parallel group according to their runtime.

I'm confused about what you have in mind here? I'm pretty sure pg_regress
launches all the scripts in a group at the same time, so that just
rearranging the order they're listed in on the schedule line shouldn't
make any noticeable difference. If you meant changing the order of
operations within each script, I don't really want to go there.
It'd require careful per-script analysis, and to the extent that the
existing tests have some meaningful ordering (admittedly, many don't),
we'd lose that.

Thoughts? Anyone object to making these sorts of changes
post-feature-freeze?

I think it's a good time to do this sort of thing. It should be
easier to differentiate tests destabilising due to this change from
the noise of other changes that are going in... since currently,
the rate of those other changes should not be very high. Doing it any
later in the freeze does not seem better since we might discover some
things that need to be fixed due to this.

Yeah. I wouldn't propose pushing this in shortly before beta, but
if we do it now then we've probably got a month to sort out any
problems that may appear.

regards, tom lane

#10 David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#9)
Re: Reducing the runtime of the core regression tests

On Thu, 11 Apr 2019 at 16:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

I was surprised to see nothing mentioned about attempting to roughly
sort the test order in each parallel group according to their runtime.

I'm confused about what you have in mind here? I'm pretty sure pg_regress
launches all the scripts in a group at the same time, so that just
rearranging the order they're listed in on the schedule line shouldn't
make any noticeable difference.

I probably should have looked closer at how that is handled. If they're all
launched at once then there's no point to what I mentioned. Please
disregard.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#7)
Re: Reducing the runtime of the core regression tests

Peter Geoghegan <pg@bowt.ie> writes:

On Wed, Apr 10, 2019 at 4:56 PM Peter Geoghegan <pg@bowt.ie> wrote:

The original fastpath tests don't seem particularly effective to me,
even without the oversight I mentioned. I suggest that you remove
them, since the minimal btree_index.sql fast path test is sufficient.

To be clear: I propose that you remove the tests entirely, and we
leave it at that. I don't intend to follow up with my own patch
because I don't think that there is anything in the original test case
that is worth salvaging.

I checked into this by dint of comparing "make coverage" output for
"make check" runs with and without indexing.sql's fastpath tests.
There were some differences that seem mostly to be down to whether
or not autovacuum hit particular code paths during the test run.
In total, I found 29 lines that were hit in the first test but not
in the second ... and 141 lines that were hit in the second test
but not the first. So I concur that indexing.sql's fastpath test
isn't adding anything useful coverage-wise, and will just nuke it.

(It'd be interesting perhaps to check whether the results shown
by coverage.postgresql.org are similarly unstable. They might be
less so, since I believe those are taken over the whole check-world
suite not just the core regression tests.)
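(The line-by-line comparison described above can be sketched as a set difference over per-line hit counts; the dicts below are hypothetical stand-ins for counts one might scrape from *.gcov files, not real data from these runs:)

```python
# Diff two coverage runs: which lines were hit in one run but not the other.
def hit_lines(counts):
    # counts maps line number -> execution count for one run
    return {line for line, n in counts.items() if n > 0}

run_with_fastpath    = {10: 5, 11: 5, 12: 0, 13: 2, 14: 0}
run_without_fastpath = {10: 5, 11: 0, 12: 7, 13: 2, 14: 3}

only_first  = hit_lines(run_with_fastpath) - hit_lines(run_without_fastpath)
only_second = hit_lines(run_without_fastpath) - hit_lines(run_with_fastpath)
print(sorted(only_first), sorted(only_second))  # [11] [12, 14]
```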

regards, tom lane

#12 Peter Geoghegan
pg@bowt.ie
In reply to: Tom Lane (#11)
Re: Reducing the runtime of the core regression tests

On Thu, Apr 11, 2019 at 9:55 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

So I concur that indexing.sql's fastpath test
isn't adding anything useful coverage-wise, and will just nuke it.

Good.

(It'd be interesting perhaps to check whether the results shown
by coverage.postgresql.org are similarly unstable. They might be
less so, since I believe those are taken over the whole check-world
suite not just the core regression tests.)

I'm almost certain that they're at least slightly unstable. I mostly
find the report useful because it shows whether or not something gets
hit at all. I don't trust it to be very accurate.

I've noticed that the coverage reported on coverage.postgresql.org
sometimes looks contradictory, which can happen due to compiler
optimizations. I wonder if that could be addressed in some way,
because I find the site to be a useful resource. I would at least like
to know the settings used by its builds.

--
Peter Geoghegan

#13 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Geoghegan (#12)
Re: Reducing the runtime of the core regression tests

On 2019-Apr-11, Peter Geoghegan wrote:

I've noticed that the coverage reported on coverage.postgresql.org
sometimes looks contradictory, which can happen due to compiler
optimizations. I wonder if that could be addressed in some way,
because I find the site to be a useful resource. I would at least like
to know the settings used by its builds.

./configure --enable-depend --enable-coverage --enable-tap-tests --enable-nls --with-python --with-perl --with-tcl --with-openssl --with-libxml --with-ldap --with-pam >> $LOG 2>&1

make -j4 >> $LOG 2>&1
make -j4 -C contrib >> $LOG 2>&1
make check-world PG_TEST_EXTRA="ssl ldap" >> $LOG 2>&1
make coverage-html >> $LOG 2>&1

There are no environment variables that would affect it.

If you want to propose changes, feel free.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#14 Peter Geoghegan
pg@bowt.ie
In reply to: Alvaro Herrera (#13)
Re: Reducing the runtime of the core regression tests

On Thu, Apr 11, 2019 at 11:00 AM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

./configure --enable-depend --enable-coverage --enable-tap-tests --enable-nls --with-python --with-perl --with-tcl --with-openssl --with-libxml --with-ldap --with-pam >> $LOG 2>&1

make -j4 >> $LOG 2>&1
make -j4 -C contrib >> $LOG 2>&1
make check-world PG_TEST_EXTRA="ssl ldap" >> $LOG 2>&1
make coverage-html >> $LOG 2>&1

There are no environment variables that would affect it.

Could we add "CFLAGS=-O0"? This should prevent the kind of
machine-wise line-counting described here:

https://gcc.gnu.org/onlinedocs/gcc/Gcov-and-Optimization.html#Gcov-and-Optimization

I think that it makes sense to prioritize making it clear which exact
lines were executed in terms of the semantics of C. I might prefer to
have optimizations enabled if I was optimizing my code, but that's not
what the web resource is for, really.

Thanks
--
Peter Geoghegan

#15 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Geoghegan (#14)
Re: Reducing the runtime of the core regression tests

On 2019-Apr-11, Peter Geoghegan wrote:

On Thu, Apr 11, 2019 at 11:00 AM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

./configure --enable-depend --enable-coverage --enable-tap-tests --enable-nls --with-python --with-perl --with-tcl --with-openssl --with-libxml --with-ldap --with-pam >> $LOG 2>&1

make -j4 >> $LOG 2>&1
make -j4 -C contrib >> $LOG 2>&1
make check-world PG_TEST_EXTRA="ssl ldap" >> $LOG 2>&1
make coverage-html >> $LOG 2>&1

There are no environment variables that would affect it.

Could we add "CFLAGS=-O0"?

Done. Do you have a preferred spot where the counts were wrong?

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#16 Peter Geoghegan
pg@bowt.ie
In reply to: Alvaro Herrera (#15)
Re: Reducing the runtime of the core regression tests

On Fri, Apr 12, 2019 at 9:31 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Done. Do you have a preferred spot where the counts were wrong?

Not really, but I can give you an example.

Line counts for each of the two "break" statements within
_bt_keep_natts_fast() are exactly the same. I don't think that this
because we actually hit each break exactly the same number of times
(90,236 times). I think that we see this because the same instruction
is associated with both break statements in the loop. All of the
examples I've noticed are a bit like that. Not a huge problem, but
less useful than the alternative.

--
Peter Geoghegan

#17 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Geoghegan (#16)
Re: Reducing the runtime of the core regression tests

On 2019-Apr-12, Peter Geoghegan wrote:

On Fri, Apr 12, 2019 at 9:31 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

Done. Do you have a preferred spot where the counts were wrong?

Not really, but I can give you an example.

Line counts for each of the two "break" statements within
_bt_keep_natts_fast() are exactly the same. I don't think that this is
because we actually hit each break exactly the same number of times
(90,236 times). I think that we see this because the same instruction
is associated with both break statements in the loop. All of the
examples I've noticed are a bit like that. Not a huge problem, but
less useful than the alternative.

Hmm, it's odd, because
https://coverage.postgresql.org/src/backend/access/nbtree/nbtutils.c.gcov.html
still shows that function doing that. pg_config shows:

$ ./pg_config --configure
'--enable-depend' '--enable-coverage' '--enable-tap-tests' '--enable-nls' '--with-python' '--with-perl' '--with-tcl' '--with-openssl' '--with-libxml' '--with-ldap' '--with-pam' 'CFLAGS=-O0'

src/Makefile.global says:

CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fprofile-arcs -ftest-coverage -O0

the compile line for nbtutils is:

gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fprofile-arcs -ftest-coverage -O0 -I../../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o nbtutils.o nbtutils.c -MMD -MP -MF .deps/nbtutils.Po

so I suppose there's something else that's affecting this.

I wonder if it would be useful to add --enable-debug. I think I
purposefully removed that, but I don't remember any details about it.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#18 Peter Geoghegan
pg@bowt.ie
In reply to: Alvaro Herrera (#17)
Re: Reducing the runtime of the core regression tests

On Fri, Apr 12, 2019 at 10:24 AM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

I wonder if it would be useful to add --enable-debug. I think I
purposefully removed that, but I don't remember any details about it.

As usual, this stuff is horribly under-documented. I think it's
possible that --enable-debug would help, since llvm-gcov requires it,
but that doesn't seem particularly likely.

It's definitely generally recommended that "-O0" be used, so I think
that we can agree that that was an improvement, even if it doesn't fix
the remaining problem that I noticed when I rechecked nbtutils.c.

--
Peter Geoghegan

#19 Peter Geoghegan
pg@bowt.ie
In reply to: Peter Geoghegan (#18)
Re: Reducing the runtime of the core regression tests

On Fri, Apr 12, 2019 at 10:49 AM Peter Geoghegan <pg@bowt.ie> wrote:

It's definitely generally recommended that "-O0" be used, so I think
that we can agree that that was an improvement, even if it doesn't fix
the remaining problem that I noticed when I rechecked nbtutils.c.

I'm not sure that we can really assume that "-O0" avoids the behavior
I pointed out. Perhaps this counts as "semantic flattening" or
something, rather than an optimization. I could have easily written
the code in _bt_keep_natts_fast() in the way gcov/gcc/whatever thinks
I ought to have, which would have obscured the distinction anyway.

--
Peter Geoghegan

#20 Peter Geoghegan
pg@bowt.ie
In reply to: Alvaro Herrera (#17)
Re: Reducing the runtime of the core regression tests

On Fri, Apr 12, 2019 at 10:24 AM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Hmm, it's odd, because
https://coverage.postgresql.org/src/backend/access/nbtree/nbtutils.c.gcov.html
still shows that function doing that. pg_config shows:

$ ./pg_config --configure
'--enable-depend' '--enable-coverage' '--enable-tap-tests' '--enable-nls' '--with-python' '--with-perl' '--with-tcl' '--with-openssl' '--with-libxml' '--with-ldap' '--with-pam' 'CFLAGS=-O0'

So, we're currently using this on coverage.postgresql.org? We've switched?

I noticed a better example of weird line counts today, this time
within _bt_check_rowcompare():

1550 4 : cmpresult = 0;
1551 4 : if (subkey->sk_flags & SK_ROW_END)
1552 1292 : break;
1553 0 : subkey++;
1554 0 : continue;

I would expect the "break" statement to have a line count that is no
greater than that of the first two lines that immediately precede, and
yet it's far far greater (1292 is greater than 4). It looks like there
has been some kind of loop transformation.

--
Peter Geoghegan

#21 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Geoghegan (#20)
In reply to: Alvaro Herrera (#21)