incorrect xlog.c coverage report

Started by Peter Eisentrautabout 7 years ago18 messages

peter.eisentraut@2ndquadrant.com

about 7 years ago

I noticed some strangeness in the test coverage reporting. For example,
in
https://coverage.postgresql.org/src/backend/access/transam/xlog.c.gcov.html
in the function readRecoveryCommandFile(), most of the branches parsing
the individual recovery options (recovery_target_xid,
recovery_target_time, etc.) are shown as never hit, even though there
are explicit tests for this in
src/test/recovery/t/003_recovery_targets.pl. I tried this locally and
also with -O0 just in case, but I get the same results. Any ideas?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Alvaro Herrera

alvherre@2ndquadrant.com

about 7 years ago

In reply to: Peter Eisentraut (#1)

Re: incorrect xlog.c coverage report

On 2018-Nov-20, Peter Eisentraut wrote:

I noticed some strangeness in the test coverage reporting. For example,
in
https://coverage.postgresql.org/src/backend/access/transam/xlog.c.gcov.html
in the function readRecoveryCommandFile(), most of the branches parsing
the individual recovery options (recovery_target_xid,
recovery_target_time, etc.) are shown as never hit, even though there
are explicit tests for this in
src/test/recovery/t/003_recovery_targets.pl. I tried this locally and
also with -O0 just in case, but I get the same results. Any ideas?

I've posted this before, but as a reminder, the coverage script does this:

./configure --enable-depend --enable-coverage --enable-tap-tests --enable-nls --with-python --with-perl --with-tcl --with-openssl --with-libxml --with-ldap --with-pam >> $LOG 2>&1

# run tests
make -j4 >> $LOG 2>&1
make -j4 -C contrib >> $LOG 2>&1
make check-world PG_TEST_EXTRA="ssl ldap" >> $LOG 2>&1
make coverage-html >> $LOG 2>&1

I certainly expect that this would run the recovery test. And today's log
file has this:

make -C recovery check
make[2]: Entering directory '/home/coverage/pgsrc/pgsql/src/test/recovery'
for extra in contrib/test_decoding; do make -C '../../..'/$extra DESTDIR='/home/coverage/pgsrc/pgsql'/tmp_install install >>'/home/coverage/pgsrc/pgsql'/tmp_install/log/install.log || exit; done
rm -rf '/home/coverage/pgsrc/pgsql/src/test/recovery'/tmp_check
/bin/mkdir -p '/home/coverage/pgsrc/pgsql/src/test/recovery'/tmp_check
cd . && TESTDIR='/home/coverage/pgsrc/pgsql/src/test/recovery' PATH="/home/coverage/pgsrc/pgsql/tmp_install/usr/local/pgsql/bin:$PATH" LD_LIBRARY_PATH="/home/coverage/pgsrc/pgsql/tmp_install/usr/local/pgsql/lib" PGPORT='65432' PG_REGRESS='/home/coverage/pgsrc/pgsql/src/test/recovery/../../../src/test/regress/pg_regress' /usr/bin/prove -I ../../../src/test/perl/ -I . t/*.pl
t/001_stream_rep.pl .................. ok
t/002_archiving.pl ................... ok
t/003_recovery_targets.pl ............ ok
t/004_timeline_switch.pl ............. ok
t/005_replay_delay.pl ................ ok
t/006_logical_decoding.pl ............ ok
t/007_sync_rep.pl .................... ok
t/008_fsm_truncation.pl .............. ok
t/009_twophase.pl .................... ok
t/010_logical_decoding_timelines.pl .. ok
t/011_crash_recovery.pl .............. ok
t/012_subtransactions.pl ............. ok
t/013_crash_restart.pl ............... ok
t/014_unlogged_reinit.pl ............. ok
t/015_promotion_pages.pl ............. ok
All tests successful.
Files=15, Tests=140, 100 wallclock secs ( 0.08 usr 0.02 sys + 18.70 cusr 8.57 csys = 27.37 CPU)
Result: PASS
make[2]: Leaving directory '/home/coverage/pgsrc/pgsql/src/test/recovery'

Also:

$ cd src/backend/access/transam
$ ls -l xlog.*
-rw-r--r-- 1 coverage coverage 397975 Nov 19 06:01 xlog.c
-rw-r--r-- 1 coverage coverage 34236 Nov 20 09:20 xlog.gcda
-rw-r--r-- 1 coverage coverage 257868 Nov 20 09:02 xlog.gcno
-rw-r--r-- 1 coverage coverage 527576 Nov 20 09:02 xlog.o

Not sure what to make of this.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Tom Lane

tgl@sss.pgh.pa.us

about 7 years ago

In reply to: Alvaro Herrera (#2)

Re: incorrect xlog.c coverage report

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On 2018-Nov-20, Peter Eisentraut wrote:

I noticed some strangeness in the test coverage reporting.

Not sure what to make of this.

What platform and compiler do you run the coverage build on?

(I'm remembering that gcov was pretty nearly entirely broken on
Fedora awhile back. Maybe it's just a little broken on whatever
you're using.)

regards, tom lane

Alvaro Herrera

alvherre@2ndquadrant.com

about 7 years ago

In reply to: Tom Lane (#3)

Re: incorrect xlog.c coverage report

On 2018-Nov-20, Tom Lane wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On 2018-Nov-20, Peter Eisentraut wrote:

I noticed some strangeness in the test coverage reporting.

Not sure what to make of this.

What platform and compiler do you run the coverage build on?

(I'm remembering that gcov was pretty nearly entirely broken on
Fedora awhile back. Maybe it's just a little broken on whatever
you're using.)

This is Debian 9.6. gcov says:

$ gcov --version
gcov (Debian 6.3.0-18+deb9u1) 6.3.0 20170516

This matches the gcc version string exactly.

ccache is not being used.

configure: using compiler=gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
configure: using CFLAGS=-Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fprofile-arcs -ftest-coverage
configure: using CPPFLAGS= -D_GNU_SOURCE -I/usr/include/libxml2
configure: using LDFLAGS= -Wl,--as-needed

I wondered if perhaps gcov's data files are ending in the wrong place
for some reason, but I don't know how to verify that. I also wondered
if test results would maybe not be saved for the postmaster executable
when run under the TAP test framework ... not sure about this.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Masahiko Sawada

sawada.mshk@gmail.com

about 7 years ago

In reply to: Alvaro Herrera (#4)

1 attachment(s)

Re: incorrect xlog.c coverage report

On Wed, Nov 21, 2018 at 12:50 AM Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

I noticed some strangeness in the test coverage reporting. For example,
in
https://coverage.postgresql.org/src/backend/access/transam/xlog.c.gcov.html
in the function readRecoveryCommandFile(), most of the branches parsing
the individual recovery options (recovery_target_xid,
recovery_target_time, etc.) are shown as never hit, even though there
are explicit tests for this in
src/test/recovery/t/003_recovery_targets.pl. I tried this locally and
also with -O0 just in case, but I get the same results. Any ideas?

I've looked into this issue and this happens on my environment (CentOS
6.9 and gcob 4.4.7) as well. ISTM the cause would related to the
immediate shutdown mode we're using in test_recovery_standby.
Interestingly in my environment with the attached one-line patch the
coverage reports the branches parsing the individual recovery options
correctly.

If my investigation is correct, all tests where use an immediate
shutdown might not counted by gcov.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

gcov.patchapplication/octet-stream; name=gcov.patchDownload

diff --git a/src/test/recovery/t/003_recovery_targets.pl b/src/test/recovery/t/003_recovery_targets.pl
index e867479..cdc10dd 100644
--- a/src/test/recovery/t/003_recovery_targets.pl
+++ b/src/test/recovery/t/003_recovery_targets.pl
@@ -40,7 +40,7 @@ sub test_recovery_standby
 	is($result, qq($num_rows), "check standby content for $test_name");
 
 	# Stop standby node
-	$node_standby->teardown_node;
+	$node_standby->stop;
 
 	return;
 }

Alvaro Herrera

alvherre@2ndquadrant.com

about 7 years ago

In reply to: Masahiko Sawada (#5)

Re: incorrect xlog.c coverage report

On 2018-Nov-21, Masahiko Sawada wrote:

I've looked into this issue and this happens on my environment (CentOS
6.9 and gcob 4.4.7) as well. ISTM the cause would related to the
immediate shutdown mode we're using in test_recovery_standby.
Interestingly in my environment with the attached one-line patch the
coverage reports the branches parsing the individual recovery options
correctly.

If my investigation is correct, all tests where use an immediate
shutdown might not counted by gcov.

Confirmed these results here, thanks for the research.

I think we should change all calls of ->teardown_node to ->stop(),
except the one in the END block, and look for places which are currently
relying too much on END (i.e. add more ->stop() calls where needed).

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Tom Lane

tgl@sss.pgh.pa.us

about 7 years ago

In reply to: Alvaro Herrera (#6)

Re: incorrect xlog.c coverage report

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

On 2018-Nov-21, Masahiko Sawada wrote:

I've looked into this issue and this happens on my environment (CentOS
6.9 and gcob 4.4.7) as well. ISTM the cause would related to the
immediate shutdown mode we're using in test_recovery_standby.

Doh, of course.

I think we should change all calls of ->teardown_node to ->stop(),
except the one in the END block, and look for places which are currently
relying too much on END (i.e. add more ->stop() calls where needed).

Hm. We probably don't want to have zero coverage of immediate stop mode,
though I agree we could cut it way back.

regards, tom lane

Michael Paquier

michael@paquier.xyz

about 7 years ago

In reply to: Tom Lane (#7)

Re: incorrect xlog.c coverage report

On Wed, Nov 21, 2018 at 01:20:48PM -0500, Tom Lane wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I think we should change all calls of ->teardown_node to ->stop(),
except the one in the END block, and look for places which are currently
relying too much on END (i.e. add more ->stop() calls where needed).

Hm. We probably don't want to have zero coverage of immediate stop mode,
though I agree we could cut it way back.

The root of the issue is that gcov is not able to write out the gcda
file when Postgres is stopped in immediate mode? There are some code
paths in the recovery tests where teardown_node is used on purpose (see
for example 009_twophase.pl).

At a lower level, teardown_node is actually just doing
$node->stop('immediate'). Would it be perhaps less confusing when
reading the tests to remove teardown_node and use $node->stop
everywhere, with the stop mode wanted? This needs a careful lookup
though. I am also wondering about the performance impact in the time it
takes to run the tests in serializable fashion. fsync is disabled so
the shutdown checkpoint would have less impact, still that's worth
checking.
--
Michael

Thomas Munro

thomas.munro@enterprisedb.com

about 7 years ago

In reply to: Michael Paquier (#8)

Re: incorrect xlog.c coverage report

On Thu, Nov 22, 2018 at 2:22 PM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Nov 21, 2018 at 01:20:48PM -0500, Tom Lane wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I think we should change all calls of ->teardown_node to ->stop(),
except the one in the END block, and look for places which are currently
relying too much on END (i.e. add more ->stop() calls where needed).

Hm. We probably don't want to have zero coverage of immediate stop mode,
though I agree we could cut it way back.

The root of the issue is that gcov is not able to write out the gcda
file when Postgres is stopped in immediate mode? There are some code
paths in the recovery tests where teardown_node is used on purpose (see
for example 009_twophase.pl).

So the issue is that quickdie() uses _exit(), so the GCOV atexit()
handler (or whatever similar mechanism they use for that) doesn't run,
right?
Presumably you could add your own call to __gcov_flush() in
quickdie(), so that we get GCOV data but no other atexit()-like stuff.
I see that some people advocate doing that in signal handlers, but I
don't know if it's really safe. If that is somehow magically OK,
you'd probably also need the chdir() hack from proc_exit() to get
per-pid files.

--
Thomas Munro
http://www.enterprisedb.com

#10

Masahiko Sawada

sawada.mshk@gmail.com

about 7 years ago

In reply to: Thomas Munro (#9)

Re: incorrect xlog.c coverage report

On Thu, Nov 22, 2018 at 10:43 AM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

On Thu, Nov 22, 2018 at 2:22 PM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Nov 21, 2018 at 01:20:48PM -0500, Tom Lane wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I think we should change all calls of ->teardown_node to ->stop(),
except the one in the END block, and look for places which are currently
relying too much on END (i.e. add more ->stop() calls where needed).

Hm. We probably don't want to have zero coverage of immediate stop mode,
though I agree we could cut it way back.

The root of the issue is that gcov is not able to write out the gcda
file when Postgres is stopped in immediate mode? There are some code
paths in the recovery tests where teardown_node is used on purpose (see
for example 009_twophase.pl).

So the issue is that quickdie() uses _exit(), so the GCOV atexit()
handler (or whatever similar mechanism they use for that) doesn't run,
right?

I think so too. Since gcov uses atexit(3) handler the exiting by
exit(3) or from main() is required.

Presumably you could add your own call to __gcov_flush() in
quickdie(), so that we get GCOV data but no other atexit()-like stuff.
I see that some people advocate doing that in signal handlers, but I
don't know if it's really safe. If that is somehow magically OK,
you'd probably also need the chdir() hack from proc_exit() to get
per-pid files.

That's probably a good idea, I'm also not sure if it's really safe
though. An alternative approach could be that we can do $node->restart
after recovered from $node->teardown_node() to write gcda file surely,
although it would make the tests hard to read.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#11

Michael Paquier

michael@paquier.xyz

about 7 years ago

In reply to: Masahiko Sawada (#10)

Re: incorrect xlog.c coverage report

On Thu, Nov 22, 2018 at 10:56:39AM +0900, Masahiko Sawada wrote:

On Thu, Nov 22, 2018 at 10:43 AM Thomas Munro <thomas.munro@enterprisedb.com> wrote:

Presumably you could add your own call to __gcov_flush() in
quickdie(), so that we get GCOV data but no other atexit()-like stuff.
I see that some people advocate doing that in signal handlers, but I
don't know if it's really safe. If that is somehow magically OK,
you'd probably also need the chdir() hack from proc_exit() to get
per-pid files.

That's probably a good idea, I'm also not sure if it's really safe
though. An alternative approach could be that we can do $node->restart
after recovered from $node->teardown_node() to write gcda file surely,
although it would make the tests hard to read.

Thanks for looking at the details around that. I'd prefer much if we
have a solution like what's outline here because we should really try to
have coverage even for code paths which involve an immediate shutdown
(mainly for recovery). Manipulating the tests to get a better coverage
feels more like a band-aid solution, and does not help folks with custom
TAP tests in their plugins.
--
Michael

#12

Alvaro Herrera

alvherre@2ndquadrant.com

about 7 years ago

In reply to: Michael Paquier (#11)

Re: incorrect xlog.c coverage report

On 2018-Nov-22, Michael Paquier wrote:

On Thu, Nov 22, 2018 at 10:56:39AM +0900, Masahiko Sawada wrote:

On Thu, Nov 22, 2018 at 10:43 AM Thomas Munro <thomas.munro@enterprisedb.com> wrote:

Presumably you could add your own call to __gcov_flush() in
quickdie(), so that we get GCOV data but no other atexit()-like stuff.
I see that some people advocate doing that in signal handlers, but I
don't know if it's really safe. If that is somehow magically OK,
you'd probably also need the chdir() hack from proc_exit() to get
per-pid files.

That's probably a good idea, I'm also not sure if it's really safe
though. An alternative approach could be that we can do $node->restart
after recovered from $node->teardown_node() to write gcda file surely,
although it would make the tests hard to read.

Thanks for looking at the details around that. I'd prefer much if we
have a solution like what's outline here because we should really try to
have coverage even for code paths which involve an immediate shutdown
(mainly for recovery). Manipulating the tests to get a better coverage
feels more like a band-aid solution, and does not help folks with custom
TAP tests in their plugins.

On the contrary, I think we shouldn't mess with the exit sequence.
Today we have three shutdown modes -- smart, fast, immediate. If we add
stuff to the exit sequence of the immediate mode, we have four shutdown
modes: those three, plus an actual server crash which would be different
from immediate. I'd rather not do that, because we'll then grow a
totally untested code path.

Anyway I now think this problem can be fixed by careful changing of
teardown_node() into stop('fast') in some places. The places for which
it actually matters that a shutdown is immediate are not really
interested with the code that executes in the server that shuts down --
they are interested in the code run by the server that *doesn't* shut
down (the replica), or the server after it restarts (and which we can
shut down cleanly afterwards). No need to mess with the backend exit
code path ISTM.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#13

Andres Freund

andres@anarazel.de

about 7 years ago

In reply to: Alvaro Herrera (#12)

Re: incorrect xlog.c coverage report

On 2018-11-21 23:45:01 -0300, Alvaro Herrera wrote:

On 2018-Nov-22, Michael Paquier wrote:

On Thu, Nov 22, 2018 at 10:56:39AM +0900, Masahiko Sawada wrote:

On Thu, Nov 22, 2018 at 10:43 AM Thomas Munro <thomas.munro@enterprisedb.com> wrote:

Presumably you could add your own call to __gcov_flush() in
quickdie(), so that we get GCOV data but no other atexit()-like stuff.
I see that some people advocate doing that in signal handlers, but I
don't know if it's really safe. If that is somehow magically OK,
you'd probably also need the chdir() hack from proc_exit() to get
per-pid files.

That's probably a good idea, I'm also not sure if it's really safe
though. An alternative approach could be that we can do $node->restart
after recovered from $node->teardown_node() to write gcda file surely,
although it would make the tests hard to read.

Thanks for looking at the details around that. I'd prefer much if we
have a solution like what's outline here because we should really try to
have coverage even for code paths which involve an immediate shutdown
(mainly for recovery). Manipulating the tests to get a better coverage
feels more like a band-aid solution, and does not help folks with custom
TAP tests in their plugins.

On the contrary, I think we shouldn't mess with the exit sequence.
Today we have three shutdown modes -- smart, fast, immediate. If we add
stuff to the exit sequence of the immediate mode, we have four shutdown
modes: those three, plus an actual server crash which would be different
from immediate. I'd rather not do that, because we'll then grow a
totally untested code path.

Anyway I now think this problem can be fixed by careful changing of
teardown_node() into stop('fast') in some places. The places for which
it actually matters that a shutdown is immediate are not really
interested with the code that executes in the server that shuts down --
they are interested in the code run by the server that *doesn't* shut
down (the replica), or the server after it restarts (and which we can
shut down cleanly afterwards). No need to mess with the backend exit
code path ISTM.

I don't find this particularly convincing. Coverage ought to indicate
code coverage, not some random subset based on what kind of shutdown
mode testing uses.

Yes, it's not particularly safe to do things during an immediate
shutdown, but doing so only when computing coverage, e.g. when some
environment variable is set, seems pretty reasonable. The likelihood of
regular failures due to that seem exceedingly slim.

Greetings,

Andres Freund

#14

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Michael Paquier (#11)

Re: incorrect xlog.c coverage report

On 2018-Nov-22, Michael Paquier wrote:

On Thu, Nov 22, 2018 at 10:56:39AM +0900, Masahiko Sawada wrote:

On Thu, Nov 22, 2018 at 10:43 AM Thomas Munro <thomas.munro@enterprisedb.com> wrote:

Presumably you could add your own call to __gcov_flush() in
quickdie(), so that we get GCOV data but no other atexit()-like stuff.
I see that some people advocate doing that in signal handlers, but I
don't know if it's really safe. If that is somehow magically OK,
you'd probably also need the chdir() hack from proc_exit() to get
per-pid files.

That's probably a good idea, I'm also not sure if it's really safe
though. An alternative approach could be that we can do $node->restart
after recovered from $node->teardown_node() to write gcda file surely,
although it would make the tests hard to read.

Thanks for looking at the details around that. I'd prefer much if we
have a solution like what's outline here because we should really try to
have coverage even for code paths which involve an immediate shutdown
(mainly for recovery). Manipulating the tests to get a better coverage
feels more like a band-aid solution, and does not help folks with custom
TAP tests in their plugins.

I've just realized that we didn't do anything about this (see line 5380
in https://coverage.postgresql.org/src/backend/access/transam/xlog.c.gcov.html)
, and we should. I still prefer the solution I proposed (which is to
edit the test files to avoid immediate shutdown in certain places), but
I admit that adding __gcov_flush() to quickdie() seems to have gotten
more votes.

Are there objections to doing that now on the master branch?

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#15

Michael Paquier

michael@paquier.xyz

over 6 years ago

In reply to: Alvaro Herrera (#14)

Re: incorrect xlog.c coverage report

On Wed, May 29, 2019 at 12:09:08PM -0400, Alvaro Herrera wrote:

Are there objections to doing that now on the master branch?

Adding the flush call just on HEAD is fine for me. Not sure that
there is an actual reason to back-patch that.
--
Michael

#16

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Michael Paquier (#15)

1 attachment(s)

Re: incorrect xlog.c coverage report

On 2019-May-30, Michael Paquier wrote:

On Wed, May 29, 2019 at 12:09:08PM -0400, Alvaro Herrera wrote:

Are there objections to doing that now on the master branch?

Adding the flush call just on HEAD is fine for me. Not sure that
there is an actual reason to back-patch that.

Okay ... I did that (patch attached), and while my new __gcov_flush()
shows as covered after I run the src/test/recovery tests, the function I
mentioned earlier (validateRecoveryParameters) is not any more covered
after the patch than it was before. So I'm not sure how useful this
really is. Maybe someone can point to something that this patch is
doing wrong ... or maybe I'm just looking at the wrong report, or this
is not the function that we wanted to add coverage for?

(On a whim I named the symbol USE_GCOV_COVERAGE because we could
theoretically have coverage reports using some other symbol. I suppose
it could very well be just USE_COVERAGE instead.)

gcov after patch:

-: 5379:static void
100: 5380:validateRecoveryParameters(void)
-: 5381:{
100: 5382: if (!ArchiveRecoveryRequested)
81: 5383: return;
-: 5384:
-: 5385: /*
-: 5386: * Check for compulsory parameters
-: 5387: */
19: 5388: if (StandbyModeRequested)
-: 5389: {
19: 5390: if ((PrimaryConnInfo == NULL || strcmp(PrimaryConnInfo, "") == 0) &&
#####: 5391: (recoveryRestoreCommand == NULL || strcmp(recoveryRestoreCommand, "") == 0))
#####: 5392: ereport(WARNING,
-: 5393: (errmsg("specified neither primary_conninfo nor restore_command"),
-: 5394: errhint("The database server will regularly poll the pg_wal subdirectory to check for files placed there.")));
-: 5395: }
-: 5396: else
-: 5397: {
#####: 5398: if (recoveryRestoreCommand == NULL ||
#####: 5399: strcmp(recoveryRestoreCommand, "") == 0)
#####: 5400: ereport(FATAL,
-: 5401: (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-: 5402: errmsg("must specify restore_command when standby mode is not enabled")));
-: 5403: }
-: 5404:
-: 5405: /*
-: 5406: * Override any inconsistent requests. Note that this is a change of
-: 5407: * behaviour in 9.5; prior to this we simply ignored a request to pause if
-: 5408: * hot_standby = off, which was surprising behaviour.
-: 5409: */
38: 5410: if (recoveryTargetAction == RECOVERY_TARGET_ACTION_PAUSE &&
19: 5411: !EnableHotStandby)
#####: 5412: recoveryTargetAction = RECOVERY_TARGET_ACTION_SHUTDOWN;
-: 5413:
-: 5414: /*
-: 5415: * If user specified recovery_target_timeline, validate it or compute the
-: 5416: * "latest" value. We can't do this until after we've gotten the restore
-: 5417: * command and set InArchiveRecovery, because we need to fetch timeline
-: 5418: * history files from the archive.
-: 5419: */
19: 5420: if (recoveryTargetTimeLineGoal == RECOVERY_TARGET_TIMELINE_NUMERIC)
-: 5421: {
#####: 5422: TimeLineID rtli = recoveryTargetTLIRequested;
-: 5423:
-: 5424: /* Timeline 1 does not have a history file, all else should */
#####: 5425: if (rtli != 1 && !existsTimeLineHistory(rtli))
#####: 5426: ereport(FATAL,
-: 5427: (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-: 5428: errmsg("recovery target timeline %u does not exist",
-: 5429: rtli)));
#####: 5430: recoveryTargetTLI = rtli;
-: 5431: }
19: 5432: else if (recoveryTargetTimeLineGoal == RECOVERY_TARGET_TIMELINE_LATEST)
-: 5433: {
-: 5434: /* We start the "latest" search from pg_control's timeline */
19: 5435: recoveryTargetTLI = findNewestTimeLine(recoveryTargetTLI);
-: 5436: }
-: 5437: else
-: 5438: {
-: 5439: /*
-: 5440: * else we just use the recoveryTargetTLI as already read from
-: 5441: * ControlFile
-: 5442: */
#####: 5443: Assert(recoveryTargetTimeLineGoal == RECOVERY_TARGET_TIMELINE_CONTROLFILE);
-: 5444: }
-: 5445:}

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

gcov-flush.patchtext/x-diff; charset=us-asciiDownload

diff --git a/configure b/configure
index fd61bf6472..c0cf19d662 100755
--- a/configure
+++ b/configure
@@ -3516,6 +3516,10 @@ fi
 if test -z "$GENHTML"; then
   as_fn_error $? "genhtml not found" "$LINENO" 5
 fi
+
+$as_echo "#define USE_GCOV_COVERAGE 1" >>confdefs.h
+
+
       ;;
     no)
       :
diff --git a/configure.in b/configure.in
index 4586a1716c..21465bbaa6 100644
--- a/configure.in
+++ b/configure.in
@@ -222,7 +222,9 @@ fi
 PGAC_PATH_PROGS(GENHTML, genhtml)
 if test -z "$GENHTML"; then
   AC_MSG_ERROR([genhtml not found])
-fi])
+fi
+AC_DEFINE([USE_GCOV_COVERAGE], 1, [Define to use gcov coverage support stuff])
+])
 AC_SUBST(enable_coverage)
 
 #
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 44a59e1d4f..a483eba454 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2729,6 +2729,14 @@ quickdie(SIGNAL_ARGS)
 			 errhint("In a moment you should be able to reconnect to the"
 					 " database and repeat your command.")));
 
+#ifdef USE_GCOV_COVERAGE
+	/*
+	 * We still want to flush coverage data down to disk, which gcov's atexit
+	 * callback would do, but we're preventing that below.
+	 */
+	__gcov_flush();
+#endif
+
 	/*
 	 * We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
 	 * because shared memory may be corrupted, so we don't want to try to
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 6cd4cfed0a..5fb1a545ed 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -916,6 +916,9 @@
    (--enable-float8-byval) */
 #undef USE_FLOAT8_BYVAL
 
+/* Define to use gcov coverage support stuff */
+#undef USE_GCOV_COVERAGE
+
 /* Define to build with ICU support. (--with-icu) */
 #undef USE_ICU

#17

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Alvaro Herrera (#16)

Re: incorrect xlog.c coverage report

On 2019-May-31, Alvaro Herrera wrote:

On 2019-May-30, Michael Paquier wrote:

On Wed, May 29, 2019 at 12:09:08PM -0400, Alvaro Herrera wrote:

Are there objections to doing that now on the master branch?

Adding the flush call just on HEAD is fine for me. Not sure that
there is an actual reason to back-patch that.

Okay ... I did that (patch attached), and while my new __gcov_flush()
shows as covered after I run the src/test/recovery tests, the function I
mentioned earlier (validateRecoveryParameters) is not any more covered
after the patch than it was before.

I forgot to mention that this patch produces a new warning:

/pgsql/source/master/src/backend/tcop/postgres.c: In function 'quickdie':
/pgsql/source/master/src/backend/tcop/postgres.c:2737:2: warning: implicit declaration of function '__gcov_flush'; did you mean 'pq_flush'? [-Wimplicit-function-declaration]
__gcov_flush();
^~~~~~~~~~~~
pq_flush

I couldn't find a way to squelch that. gcc devs in their infinite
wisdom don't provide a prototype for it, apparently.

Another thing I thought about was adding a configure test for the
function, but a) apparently the function is very old so it's not
necessary, and b) it fails anyway apparently because of that warning.

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#18

Tom Lane

tgl@sss.pgh.pa.us

over 6 years ago

In reply to: Alvaro Herrera (#17)

Re: incorrect xlog.c coverage report

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

I forgot to mention that this patch produces a new warning:

/pgsql/source/master/src/backend/tcop/postgres.c: In function 'quickdie':
/pgsql/source/master/src/backend/tcop/postgres.c:2737:2: warning: implicit declaration of function '__gcov_flush'; did you mean 'pq_flush'? [-Wimplicit-function-declaration]
__gcov_flush();
^~~~~~~~~~~~
pq_flush

I couldn't find a way to squelch that. gcc devs in their infinite
wisdom don't provide a prototype for it, apparently.

Ugh. So let's supply our own prototype, presumably it's just

extern void __gcov_flush(void);

regards, tom lane