gcov coverage data not full with immediate stop

Started by Alexander Lakhin over 5 years ago, 12 messages

exclusion@gmail.com

over 5 years ago

1 attachment(s)

Hello hackers,

I've found that gcov coverage data misses some information when a postgres
node is stopped in 'immediate' mode.
For example, on the master branch:
make coverage-clean; time make check -C src/test/recovery/; make coverage-html
generates a coverage report with 106193 lines/6318 functions for me
(`make check` takes 1m34s).
But with the attached simple patch I get a coverage report with 106540
lines/6332 functions (and `make check` takes 2m5s).
(IMO, the slowdown of the test is significant.)

So if we want to make the coverage reports more precise, I see three
ways:
1. Change the stop mode in teardown_node to fast (probably only when
configured with --enable-coverage);
2. Explicitly stop nodes in TAP tests (where it's important) -- seems
too tedious and troublesome;
3. Explicitly call __gcov_flush in SIGQUIT handler (quickdie)?
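
Why the stop mode matters can be illustrated outside PostgreSQL. Coverage counters are normally written out when a process exits normally, and a process that leaves through _exit(), as the SIGQUIT handlers do on an immediate stop, skips exit-time callbacks. The following is a minimal sketch under stated assumptions, not PostgreSQL code: dump_counters() and callback_ran() are hypothetical names, and a plain atexit() callback stands in for gcov's own exit-time dump.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the report pipe; only set in the forked child. */
static int report_fd = -1;

/* Hypothetical stand-in for the exit-time dump of coverage counters. */
static void
dump_counters(void)
{
    if (report_fd >= 0)
    {
        ssize_t rc = write(report_fd, "D", 1);

        (void) rc;              /* nothing useful to do on failure */
    }
}

/*
 * Fork a child that registers dump_counters() with atexit() and then
 * terminates through exit() (clean stop) or _exit() (immediate stop).
 * Returns 1 if the callback ran, 0 if it was skipped, -1 on setup failure.
 */
int
callback_ran(int use_underscore_exit)
{
    int     fds[2];
    pid_t   pid;
    char    c;
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
    {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: arrange the "dump", then leave one way or the other. */
        close(fds[0]);
        report_fd = fds[1];
        atexit(dump_counters);
        if (use_underscore_exit)
            _exit(2);           /* atexit() callbacks are skipped */
        exit(0);                /* atexit() callbacks run */
    }

    /* Parent: one byte means the callback ran; EOF means it did not. */
    close(fds[1]);
    n = read(fds[0], &c, 1);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (n == 1) ? 1 : 0;
}
```

Under these assumptions callback_ran(0) reports that the callback ran, while callback_ran(1) reports that it was skipped, which mirrors a clean stop keeping the data and an immediate stop losing it.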

Best regards,
Alexander

Attachments:

pgnode-stop.patch (text/x-patch; charset=UTF-8)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index af13c320e9c..99c1e85e1a3 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -1242,7 +1242,7 @@ sub teardown_node
 {
 	my $self = shift;
 
-	$self->stop('immediate');
+	$self->stop();
 	return;
 }

Alvaro Herrera

alvherre@2ndquadrant.com

over 5 years ago

In reply to: Alexander Lakhin (#1)

2 attachment(s)

Re: gcov coverage data not full with immediate stop

(Strangely, I was just thinking about these branches of mine as I
closed my week last Friday...)

On 2020-May-10, Alexander Lakhin wrote:

> So if we want to make the coverage reports more precise, I see three
> ways:
> 1. Change the stop mode in teardown_node to fast (probably only when
> configured with --enable-coverage);
> 2. Explicitly stop nodes in TAP tests (where it's important) -- seems
> too tedious and troublesome;
> 3. Explicitly call __gcov_flush in SIGQUIT handler (quickdie)?

I tried your idea 3 a long time ago and my experiments didn't show an
increase in coverage [1]. But I like this idea the best, and maybe I
did something wrong. Attached is the patch I had (on top of
fc115d0f9fc6), but I don't know if it still applies.

(The second attachment is another branch I had on this, I don't remember
why; that one was on top of 438e51987dcc. The curious thing is that I
didn't add the __gcov_flush to quickdie in this one. Maybe what we need
is a mix of both.)

I think we should definitely get this fixed for pg13 ...

[1]: /messages/by-id/20190531170503.GA24057@alvherre.pgsql

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-add-gcov_flush-call-in-quickdie.patch (text/x-diff; charset=us-ascii)

From 9803714aa0e493c603e04e282a241cfa89507da3 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Fri, 31 May 2019 13:09:46 -0400
Subject: [PATCH] add gcov_flush call in quickdie

---
 configure                   | 4 ++++
 configure.in                | 4 +++-
 src/backend/tcop/postgres.c | 8 ++++++++
 src/include/pg_config.h.in  | 3 +++
 4 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index fd61bf6472..c0cf19d662 100755
--- a/configure
+++ b/configure
@@ -3516,6 +3516,10 @@ fi
 if test -z "$GENHTML"; then
   as_fn_error $? "genhtml not found" "$LINENO" 5
 fi
+
+$as_echo "#define USE_GCOV_COVERAGE 1" >>confdefs.h
+
+
       ;;
     no)
       :
diff --git a/configure.in b/configure.in
index 4586a1716c..21465bbaa6 100644
--- a/configure.in
+++ b/configure.in
@@ -222,7 +222,9 @@ fi
 PGAC_PATH_PROGS(GENHTML, genhtml)
 if test -z "$GENHTML"; then
   AC_MSG_ERROR([genhtml not found])
-fi])
+fi
+AC_DEFINE([USE_GCOV_COVERAGE], 1, [Define to use gcov coverage support stuff])
+])
 AC_SUBST(enable_coverage)
 
 #
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 44a59e1d4f..a483eba454 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2729,6 +2729,14 @@ quickdie(SIGNAL_ARGS)
 			 errhint("In a moment you should be able to reconnect to the"
 					 " database and repeat your command.")));
 
+#ifdef USE_GCOV_COVERAGE
+	/*
+	 * We still want to flush coverage data down to disk, which gcov's atexit
+	 * callback would do, but we're preventing that below.
+	 */
+	__gcov_flush();
+#endif
+
 	/*
 	 * We DO NOT want to run proc_exit() or atexit() callbacks -- we're here
 	 * because shared memory may be corrupted, so we don't want to try to
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 6cd4cfed0a..5fb1a545ed 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -916,6 +916,9 @@
    (--enable-float8-byval) */
 #undef USE_FLOAT8_BYVAL
 
+/* Define to use gcov coverage support stuff */
+#undef USE_GCOV_COVERAGE
+
 /* Define to build with ICU support. (--with-icu) */
 #undef USE_ICU
 
-- 
2.20.1

0001-gcov_flush-stuff.patch (text/x-diff; charset=us-ascii)

From 853d4c901ce4b53285b43ddf44cb567f774a2dd8 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: Fri, 31 May 2019 11:20:23 -0400
Subject: [PATCH] gcov_flush stuff

---
 config/c-compiler.m4 |  15 +++++++
 configure            | 102 +++++++++++++++++++++++++++++++++++++++++++
 configure.in         |   9 ++++
 3 files changed, 126 insertions(+)

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 71b645839d..0a73fd4624 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -394,6 +394,21 @@ AC_DEFINE_UNQUOTED(AS_TR_CPP([HAVE$1]), 1,
                    [Define to 1 if your compiler understands $1.])
 fi])# PGAC_CHECK_BUILTIN_FUNC
 
+AC_DEFUN([PGAC_CHECK_BUILTIN_FUNC0],
+[AC_CACHE_CHECK(for $1, pgac_cv$1,
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([
+int
+call$1()
+{
+    return $1();
+}], [])],
+[pgac_cv$1=yes],
+[pgac_cv$1=no])])
+if test x"${pgac_cv$1}" = xyes ; then
+AC_DEFINE_UNQUOTED(AS_TR_CPP([HAVE$1]), 1,
+                   [Define to 1 if your compiler understands $1.])
+fi])# PGAC_CHECK_BUILTIN_FUNC0
+
 
 
 # PGAC_PROG_VARCC_VARFLAGS_OPT
diff --git a/configure b/configure
index fd61bf6472..28ebab733f 100755
--- a/configure
+++ b/configure
@@ -11915,6 +11915,69 @@ fi
   fi
 fi
 
+if test "$enable_coverage" = yes ; then
+  if test "$PORTNAME" != "win32"; then
+    { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing __gcov_flush" >&5
+$as_echo_n "checking for library containing __gcov_flush... " >&6; }
+if ${ac_cv_search___gcov_flush+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  ac_func_search_save_LIBS=$LIBS
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+#ifdef __cplusplus
+extern "C"
+#endif
+char __gcov_flush ();
+int
+main ()
+{
+return __gcov_flush ();
+  ;
+  return 0;
+}
+_ACEOF
+for ac_lib in '' gcov; do
+  if test -z "$ac_lib"; then
+    ac_res="none required"
+  else
+    ac_res=-l$ac_lib
+    LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
+  fi
+  if ac_fn_c_try_link "$LINENO"; then :
+  ac_cv_search___gcov_flush=$ac_res
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext
+  if ${ac_cv_search___gcov_flush+:} false; then :
+  break
+fi
+done
+if ${ac_cv_search___gcov_flush+:} false; then :
+
+else
+  ac_cv_search___gcov_flush=no
+fi
+rm conftest.$ac_ext
+LIBS=$ac_func_search_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search___gcov_flush" >&5
+$as_echo "$ac_cv_search___gcov_flush" >&6; }
+ac_res=$ac_cv_search___gcov_flush
+if test "$ac_res" != no; then :
+  test "$ac_res" = "none required" || LIBS="$ac_res $LIBS"
+
+else
+  as_fn_error $? "could not find __gcov_flush" "$LINENO" 5
+fi
+
+	fi
+fi
+
 if test "$with_openssl" = yes ; then
     if test "$PORTNAME" != "win32"; then
      { $as_echo "$as_me:${as_lineno-$LINENO}: checking for CRYPTO_new_ex_data in -lcrypto" >&5
@@ -15420,6 +15483,45 @@ _ACEOF
 
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __gcov_flush" >&5
+$as_echo_n "checking for __gcov_flush... " >&6; }
+if ${pgac_cv__gcov_flush+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+int
+call__gcov_flush()
+{
+    return __gcov_flush(x);
+}
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv__gcov_flush=yes
+else
+  pgac_cv__gcov_flush=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__gcov_flush" >&5
+$as_echo "$pgac_cv__gcov_flush" >&6; }
+if test x"${pgac_cv__gcov_flush}" = xyes ; then
+
+cat >>confdefs.h <<_ACEOF
+#define HAVE__GCOV_FLUSH 1
+_ACEOF
+
+fi
+
 ac_fn_c_check_func "$LINENO" "fseeko" "ac_cv_func_fseeko"
 if test "x$ac_cv_func_fseeko" = xyes; then :
   $as_echo "#define HAVE_FSEEKO 1" >>confdefs.h
diff --git a/configure.in b/configure.in
index 4586a1716c..bc97398bda 100644
--- a/configure.in
+++ b/configure.in
@@ -1197,6 +1197,13 @@ if test "$with_gssapi" = yes ; then
   fi
 fi
 
+if test "$enable_coverage" = yes ; then
+  if test "$PORTNAME" != "win32"; then
+    AC_SEARCH_LIBS(__gcov_flush, gcov, [],
+				   [AC_MSG_ERROR([could not find __gcov_flush])])
+	fi
+fi
+
 if test "$with_openssl" = yes ; then
   dnl Order matters!
   if test "$PORTNAME" != "win32"; then
@@ -1650,6 +1657,8 @@ PGAC_CHECK_BUILTIN_FUNC([__builtin_clz], [unsigned int x])
 PGAC_CHECK_BUILTIN_FUNC([__builtin_ctz], [unsigned int x])
 PGAC_CHECK_BUILTIN_FUNC([__builtin_popcount], [unsigned int x])
 
+PGAC_CHECK_BUILTIN_FUNC0([__gcov_flush])
+
 AC_REPLACE_FUNCS(fseeko)
 case $host_os in
 	# NetBSD uses a custom fseeko/ftello built on fsetpos/fgetpos
-- 
2.20.1

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: Alvaro Herrera (#2)

Re: gcov coverage data not full with immediate stop

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

> On 2020-May-10, Alexander Lakhin wrote:
>
>> 3. Explicitly call __gcov_flush in SIGQUIT handler (quickdie)?
>
> I tried your idea 3 a long time ago and my experiments didn't show an
> increase in coverage [1]. But I like this idea the best, and maybe I
> did something wrong. Attached is the patch I had (on top of
> fc115d0f9fc6), but I don't know if it still applies.

Putting ill-defined, not-controlled-by-us work into a quickdie signal
handler sounds like a really bad idea to me. Maybe it's all right,
since presumably it would only appear in specialized test builds; but
even so, how much could you trust the results?

> I think we should definitely get this fixed for pg13 ...

-1 for shoving in such a thing so late in the cycle. We've survived
without it for years, we can do so for a few months more.

regards, tom lane

Alexander Lakhin

exclusion@gmail.com

over 5 years ago

In reply to: Alvaro Herrera (#2)

1 attachment(s)

Re: gcov coverage data not full with immediate stop

Hello Alvaro,
11.05.2020 06:42, Alvaro Herrera wrote:

> (Strangely, I was just thinking about these branches of mine as I
> closed my week last Friday...)
>
> On 2020-May-10, Alexander Lakhin wrote:
>
>> So if we want to make the coverage reports more precise, I see three
>> ways:
>> 1. Change the stop mode in teardown_node to fast (probably only when
>> configured with --enable-coverage);
>> 2. Explicitly stop nodes in TAP tests (where it's important) -- seems
>> too tedious and troublesome;
>> 3. Explicitly call __gcov_flush in SIGQUIT handler (quickdie)?
>
> I tried your idea 3 a long time ago and my experiments didn't show an
> increase in coverage [1]. But I like this idea the best, and maybe I
> did something wrong. Attached is the patch I had (on top of
> fc115d0f9fc6), but I don't know if it still applies.

Thanks for the reference to that discussion and your patch.
As I see it, the issue with that patch is that quickdie() is not the only
SIGQUIT handler. When a backend is interrupted with SIGQUIT, it
exits in SignalHandlerForCrashExit().
In fact if I only add __gcov_flush() in SignalHandlerForCrashExit(), it
raises test coverage for `make check -C src/test/recovery/` from
106198 lines/6319 functions
to
106420 lines/6328 functions

It's not yet clear to me what happens when __gcov_flush() is called inside
__gcov_flush().
With __gcov_flush() in quickdie() as well, the test coverage changes to:
108432 lines/5417 functions
(number of function calls decreased)
And for example in coverage/src/backend/utils/cache/catcache.c.gcov.html
I see
    147          8 : int2eqfast(Datum a, Datum b)
...
    153          0 : int2hashfast(Datum datum)
but without __gcov_flush in quickdie() we have:
    147      78038 : int2eqfast(Datum a, Datum b)
...
    153     255470 : int2hashfast(Datum datum)
So it needs more investigation.

But I can confirm that calling __gcov_flush() in
SignalHandlerForCrashExit() really improves the code coverage report.
I tried to develop a test to raise the coverage for gist:
https://coverage.postgresql.org/src/backend/access/gist/gistxlog.c.gcov.html
(Please look at the attached test if it could be interesting.)
and ran into this coverage issue. I tried to play with
GCOV_PREFIX, but without luck.
Yesterday I found a more recent discussion:
/messages/by-id/44ecae53-9861-71b7-1d43-4658acc52519@2ndquadrant.com
(where probably the same problem came up).

Finally I've managed to get the expected coverage when I performed
$node_standby->stop() (but __gcov_flush() in SignalHandlerForCrashExit()
helps too).

Best regards,
Alexander

Attachments:

021_indexes-test.patch (text/x-patch; charset=UTF-8)

diff --git a/src/test/recovery/t/021_indexes.pl b/src/test/recovery/t/021_indexes.pl
new file mode 100644
index 0000000000..902133d55a
--- /dev/null
+++ b/src/test/recovery/t/021_indexes.pl
@@ -0,0 +1,102 @@
+# Test testing indexes replication
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 1;
+
+# Initialize master node
+my $node_master = get_new_node('master');
+$node_master->init(allows_streaming => 1);
+$node_master->start;
+
+# Add a table for gist index check
+$node_master->safe_psql('postgres',
+    qq{
+create table gist_point_tbl(id int, p point);
+insert into gist_point_tbl(id, p)
+    select g, point(g*10, g*10) from generate_series(10000, 11000) g;
+create index gist_pointidx on gist_point_tbl using gist(p);});
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_master->backup($backup_name);
+
+# Create streaming standby from backup
+my $node_standby = get_new_node('standby');
+$node_standby->init_from_backup($node_master, $backup_name,
+    has_streaming => 1);
+$node_standby->start;
+
+$node_master->safe_psql('postgres', qq{
+create temp table gist_point_tbl_t(id int, p point);
+create index gist_pointidx_t on gist_point_tbl_t using gist(p);
+insert into gist_point_tbl_t(id, p)
+    select g, point(g*10, g*10) from generate_series(1, 1000) g;
+set enable_seqscan=off;
+set enable_bitmapscan=off;
+explain (costs off)
+select p from gist_point_tbl_t where p <@ box(point(0,0), point(100, 100));
+});
+
+$node_master->safe_psql('postgres', qq{
+create unlogged table gist_point_tbl_u(id int, p point);
+create index gist_pointidx_u on gist_point_tbl_u using gist(p);
+insert into gist_point_tbl_u(id, p)
+    select g, point(g*10, g*10) from generate_series(1, 1000) g;
+set enable_seqscan=off;
+set enable_bitmapscan=off;
+explain (costs off)
+select p from gist_point_tbl_u where p <@ box(point(0,0), point(100, 100));
+});
+
+$node_master->safe_psql('postgres', qq{
+insert into gist_point_tbl (id, p)
+    select g, point(g*10, g*10) from generate_series(1, 1000) g;});
+
+$node_master->safe_psql('postgres', "delete from gist_point_tbl where id < 500");
+
+$node_master->safe_psql('postgres', qq{
+create table test as (select x, box(point(x, x),point(x, x))
+   from generate_series(1,4000000) as x);
+
+create index test_idx on test using gist (box);
+
+set enable_seqscan TO false;
+set enable_bitmapscan TO false;
+
+delete from test where box && box(point(0,0), point(100000,100000));
+
+-- This query invokes gistkillitems()
+select count(box) from test where box && box(point(0,0), point(100000,100000));
+});
+
+$node_master->safe_psql('postgres', qq{
+insert into test
+    select x, box(point(x, x),point(x*3, x*3))
+    from generate_series(1,200000) as x;});
+
+$node_master->safe_psql('postgres', qq{
+update test set box = box where x<2000000; vacuum test;});
+$node_master->safe_psql('postgres', qq{
+delete from test where x<1000000; vacuum test;});
+$node_master->safe_psql('postgres', qq{
+insert into test
+    select x, box(point(x, x),point(x, x))
+    from generate_series(1,1000000) as x;});
+
+$node_master->safe_psql('postgres', qq{
+insert into gist_point_tbl (id, p)
+    select g,        point(g*5, g*5) from generate_series(1, 10000) g;});
+
+$node_master->wait_for_catchup($node_standby, 'replay');
+
+my $result = $node_standby->safe_psql('postgres', qq{
+set enable_seqscan=off;
+set enable_bitmapscan=off;
+explain (costs off)
+select p from gist_point_tbl where p <@ box(point(0,0), point(100, 100));});
+ok($result =~ /^Index Only Scan using gist_pointidx/, "gist index used on a standby");
+
+#$node_standby->stop();
+#$node_master->stop();

Ashutosh Bapat

ashutosh.bapat.oss@gmail.com

over 5 years ago

In reply to: Alexander Lakhin (#4)

Re: gcov coverage data not full with immediate stop

On Mon, May 11, 2020 at 2:30 PM Alexander Lakhin <exclusion@gmail.com> wrote:

> But I can confirm that calling __gcov_flush() in SignalHandlerForCrashExit() really improves a code coverage report.
>
> Finally I've managed to get an expected coverage when I performed $node_standby->stop() (but __gcov_flush() in SignalHandlerForCrashExit() helps too).

What happens if a coverage tool other than gcov is used? From that
perspective, it's better to perform a clean shutdown in the TAP tests
instead of immediate if that's possible.

--
Best Wishes,
Ashutosh Bapat

Robert Haas

robertmhaas@gmail.com

over 5 years ago

In reply to: Tom Lane (#3)

Re: gcov coverage data not full with immediate stop

On Mon, May 11, 2020 at 12:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

>> I think we should definitely get this fixed for pg13 ...
>
> -1 for shoving in such a thing so late in the cycle. We've survived
> without it for years, we can do so for a few months more.

I agree, but also, we should start thinking about when to branch. I,
too, have patches that aren't critical enough to justify pushing them
post-freeze, but which are still good improvements that I'd like to
get into the tree. I'm queueing them right now to avoid the risk of
destabilizing things, but that generates more work, for me and for
other people, if their patches force me to rebase or the other way
around. I know there's always a concern with removing the focus on
release N too soon, but the open issues list is 3 items long right
now, and 2 of those look like preexisting issues, not new problems in
v13. Meanwhile, we have 20+ active committers.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: Robert Haas (#6)

Re: gcov coverage data not full with immediate stop

Robert Haas <robertmhaas@gmail.com> writes:

> I agree, but also, we should start thinking about when to branch. I,
> too, have patches that aren't critical enough to justify pushing them
> post-freeze, but which are still good improvements that I'd like to
> get into the tree. I'm queueing them right now to avoid the risk of
> destabilizing things, but that generates more work, for me and for
> other people, if their patches force me to rebase or the other way
> around. I know there's always a concern with removing the focus on
> release N too soon, but the open issues list is 3 items long right
> now, and 2 of those look like preexisting issues, not new problems in
> v13. Meanwhile, we have 20+ active committers.

Yeah. Traditionally we've waited till the start of the next commitfest
(which I'm assuming is July 1, for lack of an Ottawa dev meeting to decide
differently). But it seems like things are slow enough that perhaps
we could branch earlier, like June 1, and give the committers a chance
to deal with some of their own stuff before starting the CF.

This is the wrong thread to be debating that in, though. Also I wonder
if this is really RMT turf?

regards, tom lane

Robert Haas

robertmhaas@gmail.com

over 5 years ago

In reply to: Tom Lane (#7)

Re: gcov coverage data not full with immediate stop

On Mon, May 11, 2020 at 4:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> This is the wrong thread to be debating that in, though.

True.

> Also I wonder if this is really RMT turf?

I think it is, but the RMT is permitted -- even encouraged -- to
consider the views of non-RMT members when making its decision.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Peter Geoghegan

pg@bowt.ie

over 5 years ago

In reply to: Tom Lane (#7)

Re: gcov coverage data not full with immediate stop

On Mon, May 11, 2020 at 1:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Yeah. Traditionally we've waited till the start of the next commitfest
> (which I'm assuming is July 1, for lack of an Ottawa dev meeting to decide
> differently). But it seems like things are slow enough that perhaps
> we could branch earlier, like June 1, and give the committers a chance
> to deal with some of their own stuff before starting the CF.

The RMT discussed this question informally yesterday. The consensus is
that we should wait and see what the early feedback from Beta 1 is
before making a final decision. An earlier June 1 branch date is an
idea that certainly has some merit, but we'd like to put off making a
final decision on that for at least another week, and possibly as long
as two weeks.

Can that easily be accommodated?

--
Peter Geoghegan

#10

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: Peter Geoghegan (#9)

Re: gcov coverage data not full with immediate stop

Peter Geoghegan <pg@bowt.ie> writes:

> On Mon, May 11, 2020 at 1:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> Yeah. Traditionally we've waited till the start of the next commitfest
>> (which I'm assuming is July 1, for lack of an Ottawa dev meeting to decide
>> differently). But it seems like things are slow enough that perhaps
>> we could branch earlier, like June 1, and give the committers a chance
>> to deal with some of their own stuff before starting the CF.
>
> The RMT discussed this question informally yesterday. The consensus is
> that we should wait and see what the early feedback from Beta 1 is
> before making a final decision. An earlier June 1 branch date is an
> idea that certainly has some merit, but we'd like to put off making a
> final decision on that for at least another week, and possibly as long
> as two weeks.
>
> Can that easily be accommodated?

There's no real lead time needed AFAICS: when we are ready to branch,
we can just do it. So sure, let's wait till the end of May to decide.
If things look bad then, we could reconsider again mid-June.

regards, tom lane

#11

Michael Paquier

michael@paquier.xyz

over 5 years ago

In reply to: Ashutosh Bapat (#5)

Re: gcov coverage data not full with immediate stop

On Mon, May 11, 2020 at 05:56:33PM +0530, Ashutosh Bapat wrote:

> What happens if a coverage tool other than gcov is used? From that
> perspective, it's better to perform a clean shutdown in the TAP tests
> instead of immediate if that's possible.

Nope, as that's the fastest path we have to shut down any remaining
nodes at the end of a test, per the END{} block at the end of
PostgresNode.pm. I would rather keep it this way, because people
tend to keep a lot of clusters alive at the end of any new test they
add, and shutdown checkpoints are not free even with fsync forced
off in the tests.

I think that a solution revolving around __gcov_flush() could be the
best deal we have, as discussed last year in the thread Álvaro quoted
upthread, and I would vote for waiting until v14 opens for business
before merging something we consider worth it.
--
Michael

#12

Peter Geoghegan

pg@bowt.ie

over 5 years ago

In reply to: Tom Lane (#10)

Re: gcov coverage data not full with immediate stop

On Tue, May 12, 2020 at 10:10 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

>> Can that easily be accommodated?
>
> There's no real lead time needed AFAICS: when we are ready to branch,
> we can just do it. So sure, let's wait till the end of May to decide.
> If things look bad then, we could reconsider again mid-June.

Great. Let's review it at the end of May, before actually branching.

--
Peter Geoghegan