Adding "large" to PG_TEST_EXTRA

Started by Andres Freundalmost 3 years ago15 messages
#1Andres Freund
andres@anarazel.de

Hi,

I'm working on rebasing [1]/messages/by-id/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de, my patch to make relation extension scale
better.

As part of that I'd like to add tests for relation extension. To be able to
test the bulk write strategy path, we need to have a few backends concurrently
load > 16MB files.

It seems pretty clear that doing that on all buildfarm machines wouldn't be
nice / welcome. And it also seems likely that this won't be the last case
where that'd be useful.

So I'd like to add a 'large' class to PG_TEST_EXTRA, that we can use in tests
that we only want to execute on machines with sufficient resources.

Makes sense?

Greetings,

Andres Freund

[1]: /messages/by-id/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de

#2Stephen Frost
sfrost@snowman.net
In reply to: Andres Freund (#1)
Re: Adding "large" to PG_TEST_EXTRA

Greetings,

* Andres Freund (andres@anarazel.de) wrote:

I'm working on rebasing [1], my patch to make relation extension scale
better.

As part of that I'd like to add tests for relation extension. To be able to
test the bulk write strategy path, we need to have a few backends concurrently
load > 16MB files.

It seems pretty clear that doing that on all buildfarm machines wouldn't be
nice / welcome. And it also seems likely that this won't be the last case
where that'd be useful.

So I'd like to add a 'large' class to PG_TEST_EXTRA, that we can use in tests
that we only want to execute on machines with sufficient resources.

Makes sense?

+1 in general. Are there existing tests that we should add into that
set that you're thinking of..? I've been working with the Kerberos
tests and that's definitely one that seems to fit this description...

Thanks,

Stephen

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#1)
Re: Adding "large" to PG_TEST_EXTRA

Andres Freund <andres@anarazel.de> writes:

As part of that I'd like to add tests for relation extension. To be able to
test the bulk write strategy path, we need to have a few backends concurrently
load > 16MB files.
It seems pretty clear that doing that on all buildfarm machines wouldn't be
nice / welcome. And it also seems likely that this won't be the last case
where that'd be useful.
So I'd like to add a 'large' class to PG_TEST_EXTRA, that we can use in tests
that we only want to execute on machines with sufficient resources.

Makes sense. I see that this approach would result in manual check-world
runs not running such tests by default either, which sounds right.

Bikeshedding a bit ... is "large" the right name? It's not awful but
I wonder if there is a better one; it seems like this class could
eventually include tests that run a long time but don't necessarily
eat disk space. "resource-intensive" is too long.

regards, tom lane

#4Andres Freund
andres@anarazel.de
In reply to: Stephen Frost (#2)
Re: Adding "large" to PG_TEST_EXTRA

Hi,

On 2023-02-13 13:45:41 -0500, Stephen Frost wrote:

Are there existing tests that we should add into that set that you're
thinking of..? I've been working with the Kerberos tests and that's
definitely one that seems to fit this description...

I think the kerberos tests are already opt-in, so I don't think we need to
gate it further.

Maybe the pgbench tests?

I guess there's an argument to be made that we should use this for e.g.
002_pg_upgrade.pl or 027_stream_regress.pl - but I think both of these test
pretty fundamental behaviour like WAL replay, which is unfortunately is pretty
easy to break, so I'd be hesitant.

I guess we could stop running the full regression tests in 002_pg_upgrade.pl
if !large?

Greetings,

Andres Freund

#5Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#3)
Re: Adding "large" to PG_TEST_EXTRA

Hi,

On 2023-02-13 13:54:59 -0500, Tom Lane wrote:

Bikeshedding a bit ... is "large" the right name? It's not awful but
I wonder if there is a better one

I did wonder about that too. But didn't come up with something more poignant.

it seems like this class could eventually include tests that run a long time
but don't necessarily eat disk space. "resource-intensive" is too long.

I'm not sure we'd want to combine time-intensive and disk-space-intensive test
in the same category. Availability of disk space and cpu cycles don't have to
correlate that well.

lotsadisk, lotsacpu? :)

Greetings,

Andres Freund

#6Stephen Frost
sfrost@snowman.net
In reply to: Andres Freund (#4)
Re: Adding "large" to PG_TEST_EXTRA

Greetings,

* Andres Freund (andres@anarazel.de) wrote:

On 2023-02-13 13:45:41 -0500, Stephen Frost wrote:

Are there existing tests that we should add into that set that you're
thinking of..? I've been working with the Kerberos tests and that's
definitely one that seems to fit this description...

I think the kerberos tests are already opt-in, so I don't think we need to
gate it further.

I'd like to lump them in with a bunch of other tests though, to give it
more chance to run.. My issue currently is that they're *too* gated.

Maybe the pgbench tests?

Sure.

I guess there's an argument to be made that we should use this for e.g.
002_pg_upgrade.pl or 027_stream_regress.pl - but I think both of these test
pretty fundamental behaviour like WAL replay, which is unfortunately is pretty
easy to break, so I'd be hesitant.

Hm. If you aren't playing with that part of the code though, maybe it'd
be nice to not run them. The pg_dump tests might also make sense to
segregate out as they can add up to be a lot, and there's more that we
could and probably should be doing there.

I guess we could stop running the full regression tests in 002_pg_upgrade.pl
if !large?

Perhaps... but then what are we testing?

Thanks,

Stephen

#7Andres Freund
andres@anarazel.de
In reply to: Stephen Frost (#6)
Re: Adding "large" to PG_TEST_EXTRA

Hi,

On 2023-02-13 14:15:24 -0500, Stephen Frost wrote:

* Andres Freund (andres@anarazel.de) wrote:

On 2023-02-13 13:45:41 -0500, Stephen Frost wrote:

Are there existing tests that we should add into that set that you're
thinking of..? I've been working with the Kerberos tests and that's
definitely one that seems to fit this description...

I think the kerberos tests are already opt-in, so I don't think we need to
gate it further.

I'd like to lump them in with a bunch of other tests though, to give it
more chance to run.. My issue currently is that they're *too* gated.

Isn't the reason that we gate them that much that the test poses a security
hazard on a multi-user system?

I don't think we should combine opting into security hazards with opting into
using disk space.

FWIW, the kerberos tests run on all CI OSs other than windows. I have
additional CI coverage for openbsd and netbsd in a separate branch, providing
further coverage - but I'm not sure we want those additional covered OSs in
core PG.

I think the tests for kerberos run frequently enough in practice. I don't know
how good the coverage they provide is, though, but that's a separate aspect to
improve anyway.

I guess there's an argument to be made that we should use this for e.g.
002_pg_upgrade.pl or 027_stream_regress.pl - but I think both of these test
pretty fundamental behaviour like WAL replay, which is unfortunately is pretty
easy to break, so I'd be hesitant.

Hm. If you aren't playing with that part of the code though, maybe it'd
be nice to not run them.

It's surprisingly easy to break it accidentally...

The pg_dump tests might also make sense to segregate out as they can add up
to be a lot, and there's more that we could and probably should be doing
there.

IMO the main issue with the pg_dump test is their verbosity, rather than the
runtime... ~8.8k subtests is a lot.

find . -name 'regress_log*'|xargs -n 1 wc -l|sort -nr|head -n 5|less
12712 ./testrun/pg_dump/002_pg_dump/log/regress_log_002_pg_dump
5124 ./testrun/pg_rewind/002_databases/log/regress_log_002_databases
1928 ./testrun/pg_rewind/001_basic/log/regress_log_001_basic
1589 ./testrun/recovery/017_shm/log/regress_log_017_shm
1077 ./testrun/pg_rewind/004_pg_xlog_symlink/log/regress_log_004_pg_xlog_symlink

I guess we could stop running the full regression tests in 002_pg_upgrade.pl
if !large?

Perhaps... but then what are we testing?

There's plenty to pg_upgrade other than the pg_dump comparison aspect.

I'm not sure it's worth spending too much energy finding tests that we can run
less commonly than now. We've pushed back on tests using lots of resources so
far, so we don't really have them...

Greetings,

Andres Freund

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#5)
Re: Adding "large" to PG_TEST_EXTRA

Andres Freund <andres@anarazel.de> writes:

On 2023-02-13 13:54:59 -0500, Tom Lane wrote:

it seems like this class could eventually include tests that run a long time
but don't necessarily eat disk space. "resource-intensive" is too long.

I'm not sure we'd want to combine time-intensive and disk-space-intensive test
in the same category. Availability of disk space and cpu cycles don't have to
correlate that well.

Yeah, I was thinking along the same lines.

lotsadisk, lotsacpu? :)

bigdisk, bigcpu?

regards, tom lane

#9Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#8)
Re: Adding "large" to PG_TEST_EXTRA

On 2023-02-13 14:55:45 -0500, Tom Lane wrote:

bigdisk, bigcpu?

Works for me.

I'll probably just add bigdisk as part of adding a test for bulk relation
extensions, mentioning in a comment that we might want bigcpu if we have a
test for it?

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#9)
Re: Adding "large" to PG_TEST_EXTRA

Andres Freund <andres@anarazel.de> writes:

On 2023-02-13 14:55:45 -0500, Tom Lane wrote:

bigdisk, bigcpu?

Works for me.

I'll probably just add bigdisk as part of adding a test for bulk relation
extensions, mentioning in a comment that we might want bigcpu if we have a
test for it?

No objection here.

regards, tom lane

#11Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#7)
Re: Adding "large" to PG_TEST_EXTRA

On 2023-02-13 Mo 14:34, Andres Freund wrote:

Hi,

On 2023-02-13 14:15:24 -0500, Stephen Frost wrote:

* Andres Freund (andres@anarazel.de) wrote:

On 2023-02-13 13:45:41 -0500, Stephen Frost wrote:

Are there existing tests that we should add into that set that you're
thinking of..? I've been working with the Kerberos tests and that's
definitely one that seems to fit this description...

I think the kerberos tests are already opt-in, so I don't think we need to
gate it further.

I'd like to lump them in with a bunch of other tests though, to give it
more chance to run.. My issue currently is that they're *too* gated.

Isn't the reason that we gate them that much that the test poses a security
hazard on a multi-user system?

That's my understanding.

I don't think we should combine opting into security hazards with opting into
using disk space.

I agree

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#12Peter Smith
smithpb2250@gmail.com
In reply to: Andres Freund (#1)
Re: Adding "large" to PG_TEST_EXTRA

On Tue, Feb 14, 2023 at 5:42 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

I'm working on rebasing [1], my patch to make relation extension scale
better.

As part of that I'd like to add tests for relation extension. To be able to
test the bulk write strategy path, we need to have a few backends concurrently
load > 16MB files.

It seems pretty clear that doing that on all buildfarm machines wouldn't be
nice / welcome. And it also seems likely that this won't be the last case
where that'd be useful.

So I'd like to add a 'large' class to PG_TEST_EXTRA, that we can use in tests
that we only want to execute on machines with sufficient resources.

Oh, I was been thinking about a similar topic recently, but I was
unaware of PG_TEST_EXTRA [1]https://www.postgresql.org/docs/devel/regress-run.html

I've observed suggested test cases get rejected as being overkill, or
because they would add precious seconds to the test execution. OTOH, I
felt such tests would still help gain some additional percentages from
the "code coverage" stats. The kind of tests I am thinking of don't
necessarily need a huge disk/CPU - but they just take longer to run
than anyone has wanted to burden the build-farm with.

~

Sorry for the thread interruption -- but I thought this might be the
right place to ask: What is the recommended way to deal with such
tests intended primarily for better code coverage?

I didn't see anything that looked pre-built for 'coverage'. Did I miss
something, or is it a case of just invent-your-own extra tests/values
for your own ad-hoc requirements?

e.g.
make check EXTRA_TESTS=extra_regress_for_coverage
make check-world PG_TEST_EXTRA='extra_tap_tests_for_coverage'

Thanks!

------
[1]: https://www.postgresql.org/docs/devel/regress-run.html

Kind Regards,
Peter Smith.
Fujitsu Australia

#13Andres Freund
andres@anarazel.de
In reply to: Peter Smith (#12)
Re: Adding "large" to PG_TEST_EXTRA

Hi,

On 2023-02-14 09:26:47 +1100, Peter Smith wrote:

I've observed suggested test cases get rejected as being overkill, or
because they would add precious seconds to the test execution. OTOH, I
felt such tests would still help gain some additional percentages from
the "code coverage" stats. The kind of tests I am thinking of don't
necessarily need a huge disk/CPU - but they just take longer to run
than anyone has wanted to burden the build-farm with.

I'd say it depend on the test whether it's worth adding. Code coverage for its
own sake isn't that useful, they have to actually test something useful. And
tests have costs beyond runtime, e.g. new tests tend to fail in some edge
cases.

E.g. just having tests hit more lines, without verifying that the behaviour is
actually correct, only provides limited additional assurance. It's also not
very useful to add a very expensive test that provides only a very small
additional amount of coverage.

IOW, even if we add more test categories, it'll still be a tradeoff.

Sorry for the thread interruption -- but I thought this might be the
right place to ask: What is the recommended way to deal with such
tests intended primarily for better code coverage?

I don't think that exists today.

Do you have an example of the kind of test you're thinking of?

Greetings,

Andres Freund

#14Peter Smith
smithpb2250@gmail.com
In reply to: Andres Freund (#13)
Re: Adding "large" to PG_TEST_EXTRA

On Tue, Feb 14, 2023 at 10:44 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2023-02-14 09:26:47 +1100, Peter Smith wrote:

I've observed suggested test cases get rejected as being overkill, or
because they would add precious seconds to the test execution. OTOH, I
felt such tests would still help gain some additional percentages from
the "code coverage" stats. The kind of tests I am thinking of don't
necessarily need a huge disk/CPU - but they just take longer to run
than anyone has wanted to burden the build-farm with.

I'd say it depend on the test whether it's worth adding. Code coverage for its
own sake isn't that useful, they have to actually test something useful. And
tests have costs beyond runtime, e.g. new tests tend to fail in some edge
cases.

E.g. just having tests hit more lines, without verifying that the behaviour is
actually correct, only provides limited additional assurance. It's also not
very useful to add a very expensive test that provides only a very small
additional amount of coverage.

IOW, even if we add more test categories, it'll still be a tradeoff.

Sorry for the thread interruption -- but I thought this might be the
right place to ask: What is the recommended way to deal with such
tests intended primarily for better code coverage?

I don't think that exists today.

Do you have an example of the kind of test you're thinking of?

No, nothing specific in mind. But maybe like these:
- tests for causing obscure errors that would never otherwise be
reached without something deliberately designed to fail a certain way
- tests for trivial user errors apparently deemed not worth bloating
the regression tests with -- e.g. many errorConflictingDefElem not
being called [1]https://coverage.postgresql.org/src/backend/commands/subscriptioncmds.c.gcov.html.
- timing-related or error tests where some long (multi-second) delay
is a necessary part of the setup.

------
[1]: https://coverage.postgresql.org/src/backend/commands/subscriptioncmds.c.gcov.html

Kind Regards,
Peter Smith.
Fujitsu Australia

#15Andres Freund
andres@anarazel.de
In reply to: Peter Smith (#14)
Re: Adding "large" to PG_TEST_EXTRA

Hi,

On 2023-02-14 11:38:06 +1100, Peter Smith wrote:

No, nothing specific in mind. But maybe like these:
- tests for causing obscure errors that would never otherwise be
reached without something deliberately designed to fail a certain way

I think there's some cases around this that could be usefu, but also a lot
that wouldn't.

- tests for trivial user errors apparently deemed not worth bloating
the regression tests with -- e.g. many errorConflictingDefElem not
being called [1].

I don't think it's worth adding a tests for all of these. The likelihood of
catching a problem seems quite small.

- timing-related or error tests where some long (multi-second) delay
is a necessary part of the setup.

IME that's almost always a sign that the test wouldn't be stable anyway.

Greetings,

Andres Freund