stats test intermittent failure

Started by Melanie Plageman over 2 years ago, 5 messages
#1 Melanie Plageman
melanieplageman@gmail.com
1 attachment(s)

Hi,

Jeff pointed out that one of the pg_stat_io tests has failed a few times
over the past months (here on morepork [1] and more recently here on
francolin [2]).

Failing test diff for those who prefer not to scroll:

+++ /home/bf/bf-build/francolin/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/stats.out
   2023-07-07 18:48:25.976313231 +0000
@@ -1415,7 +1415,7 @@
        :io_sum_vac_strategy_after_reuses > :io_sum_vac_strategy_before_reuses;
  ?column? | ?column?
 ----------+----------
- t        | t
+ t        | f

My theory about the test failure is that, when there is enough demand
for shared buffers, the flapping test fails because it expects buffer
access strategy *reuses* and concurrent queries already flushed those
buffers before they could be reused. Attached is a patch which I think
will fix the test while keeping some code coverage. If we count
evictions and reuses together, those should have increased.
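
For anyone who wants to eyeball the counters involved, a minimal query
sketch (not part of the patch; it assumes the pg_stat_io column set as it
ships in v16) is:

-- evictions and reuses are the two counters the patched test sums
SELECT backend_type, object, reads, evictions, reuses
  FROM pg_stat_io
 WHERE context = 'vacuum'
 ORDER BY backend_type, object;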

- Melanie

[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=morepork&dt=2023-06-16%2018%3A30%3A32
[2]: https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=francolin&dt=2023-07-07%2018%3A43%3A57&stg=recovery-check

Attachments:

v1-0001-Fix-pg_stat_io-buffer-reuse-test-instability.patch (text/x-patch)
From cf41551431751a2b03e0fc08b7296301fba81992 Mon Sep 17 00:00:00 2001
From: Melanie Plageman <melanieplageman@gmail.com>
Date: Mon, 10 Jul 2023 13:57:36 -0400
Subject: [PATCH v1] Fix pg_stat_io buffer reuse test instability

The stats regression test attempts to ensure that Buffer Access Strategy
"reuses" are being counted in pg_stat_io by vacuuming a table which is
larger than the size of the strategy ring. However, when shared buffers
are in sufficiently high demand, another backend could evict one of the
blocks in the strategy ring before the first backend has a chance to
reuse the buffer. The backend using the strategy would then evict a
block from another shared buffer and add that buffer to the strategy
ring. This counts as an eviction and not a reuse in pg_stat_io. Count
both evictions and reuses in the test to ensure it does not fail
incorrectly.

Reported-by: Jeff Davis
---
 src/test/regress/expected/stats.out | 30 +++++++++++++++++++----------
 src/test/regress/sql/stats.sql      | 19 +++++++++++-------
 2 files changed, 32 insertions(+), 17 deletions(-)

diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 8e63340782..2ea6bbca7a 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -1385,7 +1385,11 @@ SELECT :io_sum_local_new_tblspc_writes > :io_sum_local_after_writes;
 
 RESET temp_buffers;
 -- Test that reuse of strategy buffers and reads of blocks into these reused
--- buffers while VACUUMing are tracked in pg_stat_io.
+-- buffers while VACUUMing are tracked in pg_stat_io. If there is sufficient
+-- demand for shared buffers from concurrent queries, some blocks may be
+-- evicted from the strategy ring before they can be reused. In such cases,
+-- the backend will evict a block from a shared buffer outside of the ring
+-- and add it to the ring. This is considered an eviction and not a reuse.
 -- Set wal_skip_threshold smaller than the expected size of
 -- test_io_vac_strategy so that, even if wal_level is minimal, VACUUM FULL will
 -- fsync the newly rewritten test_io_vac_strategy instead of writing it to WAL.
@@ -1393,15 +1397,15 @@ RESET temp_buffers;
 -- shared buffers -- preventing us from testing BAS_VACUUM BufferAccessStrategy
 -- reads.
 SET wal_skip_threshold = '1 kB';
-SELECT sum(reuses) AS reuses, sum(reads) AS reads
+SELECT sum(reuses) AS reuses, sum(reads) AS reads, sum(evictions) AS evictions
   FROM pg_stat_io WHERE context = 'vacuum' \gset io_sum_vac_strategy_before_
 CREATE TABLE test_io_vac_strategy(a int, b int) WITH (autovacuum_enabled = 'false');
 INSERT INTO test_io_vac_strategy SELECT i, i from generate_series(1, 4500)i;
 -- Ensure that the next VACUUM will need to perform IO by rewriting the table
 -- first with VACUUM (FULL).
 VACUUM (FULL) test_io_vac_strategy;
--- Use the minimum BUFFER_USAGE_LIMIT to cause reuses with the smallest table
--- possible.
+-- Use the minimum BUFFER_USAGE_LIMIT to cause reuses or evictions with the
+-- smallest table possible.
 VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) test_io_vac_strategy;
 SELECT pg_stat_force_next_flush();
  pg_stat_force_next_flush 
@@ -1409,13 +1413,19 @@ SELECT pg_stat_force_next_flush();
  
 (1 row)
 
-SELECT sum(reuses) AS reuses, sum(reads) AS reads
+SELECT sum(reuses) AS reuses, sum(reads) AS reads, sum(evictions) AS evictions
   FROM pg_stat_io WHERE context = 'vacuum' \gset io_sum_vac_strategy_after_
-SELECT :io_sum_vac_strategy_after_reads > :io_sum_vac_strategy_before_reads,
-       :io_sum_vac_strategy_after_reuses > :io_sum_vac_strategy_before_reuses;
- ?column? | ?column? 
-----------+----------
- t        | t
+SELECT :io_sum_vac_strategy_after_reads > :io_sum_vac_strategy_before_reads;
+ ?column? 
+----------
+ t
+(1 row)
+
+SELECT (:io_sum_vac_strategy_after_reuses + :io_sum_vac_strategy_after_evictions) >
+  (:io_sum_vac_strategy_before_reuses + :io_sum_vac_strategy_before_evictions);
+ ?column? 
+----------
+ t
 (1 row)
 
 RESET wal_skip_threshold;
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index fddf5a8277..f398d8c5f5 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -678,7 +678,11 @@ SELECT :io_sum_local_new_tblspc_writes > :io_sum_local_after_writes;
 RESET temp_buffers;
 
 -- Test that reuse of strategy buffers and reads of blocks into these reused
--- buffers while VACUUMing are tracked in pg_stat_io.
+-- buffers while VACUUMing are tracked in pg_stat_io. If there is sufficient
+-- demand for shared buffers from concurrent queries, some blocks may be
+-- evicted from the strategy ring before they can be reused. In such cases,
+-- the backend will evict a block from a shared buffer outside of the ring
+-- and add it to the ring. This is considered an eviction and not a reuse.
 
 -- Set wal_skip_threshold smaller than the expected size of
 -- test_io_vac_strategy so that, even if wal_level is minimal, VACUUM FULL will
@@ -687,21 +691,22 @@ RESET temp_buffers;
 -- shared buffers -- preventing us from testing BAS_VACUUM BufferAccessStrategy
 -- reads.
 SET wal_skip_threshold = '1 kB';
-SELECT sum(reuses) AS reuses, sum(reads) AS reads
+SELECT sum(reuses) AS reuses, sum(reads) AS reads, sum(evictions) AS evictions
   FROM pg_stat_io WHERE context = 'vacuum' \gset io_sum_vac_strategy_before_
 CREATE TABLE test_io_vac_strategy(a int, b int) WITH (autovacuum_enabled = 'false');
 INSERT INTO test_io_vac_strategy SELECT i, i from generate_series(1, 4500)i;
 -- Ensure that the next VACUUM will need to perform IO by rewriting the table
 -- first with VACUUM (FULL).
 VACUUM (FULL) test_io_vac_strategy;
--- Use the minimum BUFFER_USAGE_LIMIT to cause reuses with the smallest table
--- possible.
+-- Use the minimum BUFFER_USAGE_LIMIT to cause reuses or evictions with the
+-- smallest table possible.
 VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT 128) test_io_vac_strategy;
 SELECT pg_stat_force_next_flush();
-SELECT sum(reuses) AS reuses, sum(reads) AS reads
+SELECT sum(reuses) AS reuses, sum(reads) AS reads, sum(evictions) AS evictions
   FROM pg_stat_io WHERE context = 'vacuum' \gset io_sum_vac_strategy_after_
-SELECT :io_sum_vac_strategy_after_reads > :io_sum_vac_strategy_before_reads,
-       :io_sum_vac_strategy_after_reuses > :io_sum_vac_strategy_before_reuses;
+SELECT :io_sum_vac_strategy_after_reads > :io_sum_vac_strategy_before_reads;
+SELECT (:io_sum_vac_strategy_after_reuses + :io_sum_vac_strategy_after_evictions) >
+  (:io_sum_vac_strategy_before_reuses + :io_sum_vac_strategy_before_evictions);
 RESET wal_skip_threshold;
 
 -- Test that extends done by a CTAS, which uses a BAS_BULKWRITE
-- 
2.37.2

#2 Alexander Lakhin
exclusion@gmail.com
In reply to: Melanie Plageman (#1)
1 attachment(s)
Re: stats test intermittent failure

Hi Melanie,

10.07.2023 21:35, Melanie Plageman wrote:

Hi,

Jeff pointed out that one of the pg_stat_io tests has failed a few times
over the past months (here on morepork [1] and more recently here on
francolin [2]).

Failing test diff for those who prefer not to scroll:

+++ /home/bf/bf-build/francolin/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/stats.out
2023-07-07 18:48:25.976313231 +0000
@@ -1415,7 +1415,7 @@
:io_sum_vac_strategy_after_reuses > :io_sum_vac_strategy_before_reuses;
?column? | ?column?
----------+----------
- t        | t
+ t        | f

My theory about the test failure is that, when there is enough demand
for shared buffers, the flapping test fails because it expects buffer
access strategy *reuses* and concurrent queries already flushed those
buffers before they could be reused. Attached is a patch which I think
will fix the test while keeping some code coverage. If we count
evictions and reuses together, those should have increased.

I managed to reproduce that failure with the attached patch applied
(on master) and with the following script (that effectively multiplies
the probability of the failure by 360):
CPPFLAGS="-O0" ./configure -q --enable-debug --enable-cassert --enable-tap-tests && make -s -j`nproc` && make -s check -C src/test/recovery
mkdir -p src/test/recovery00/t
cp src/test/recovery/t/027_stream_regress.pl src/test/recovery00/t/
cp src/test/recovery/Makefile src/test/recovery00/
for ((i=1;i<=9;i++)); do cp -r src/test/recovery00/ src/test/recovery$i; done

for ((i=1;i<=10;i++)); do echo "iteration $i"; NO_TEMP_INSTALL=1 parallel --halt now,fail=1 -j9 --linebuffer --tag make -s check -C src/test/{} ::: recovery1 recovery2 recovery3 recovery4 recovery5 recovery6 recovery7 recovery8 recovery9 || break; done

Without your patch, I get:
iteration 2
...
recovery5       #   Failed test 'regression tests pass'
recovery5       #   at t/027_stream_regress.pl line 92.
recovery5       #          got: '256'
recovery5       #     expected: '0'
...
src/test/recovery5/tmp_check/log/regress_log_027_stream_regress contains:
--- .../src/test/regress/expected/stats.out  2023-07-11 20:05:10.536059706 +0300
+++ .../src/test/recovery5/tmp_check/results/stats.out 2023-07-11 20:30:46.790551305 +0300
@@ -1418,7 +1418,7 @@
         :io_sum_vac_strategy_after_reuses > :io_sum_vac_strategy_before_reuses;
   ?column? | ?column?
  ----------+----------
- t        | t
+ t        | f
  (1 row)

With your patch applied, 10 iterations completed successfully for me.
So it looks like your theory and your fix are correct.

Best regards,
Alexander

Attachments:

regress-repeat-stats-line.patch (text/x-patch)
diff --git a/src/test/regress/expected/compression.out b/src/test/regress/expected/compression.out
index 834b7555cb..6a84afd819 100644
--- a/src/test/regress/expected/compression.out
+++ b/src/test/regress/expected/compression.out
@@ -1,3 +1,6 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS cmdata, cmdata1, cmmove1, cmmove3, cmdata2, cmmove2, cmpart, cmpart2 CASCADE;
+RESET client_min_messages;
 \set HIDE_TOAST_COMPRESSION false
 -- ensure we get stable results regardless of installation's default
 SET default_toast_compression = 'pglz';
diff --git a/src/test/regress/expected/compression_1.out b/src/test/regress/expected/compression_1.out
index ddcd137c49..7ff83d2472 100644
--- a/src/test/regress/expected/compression_1.out
+++ b/src/test/regress/expected/compression_1.out
@@ -1,3 +1,6 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS cmdata, cmdata1, cmmove1, cmmove3, cmdata2, cmmove2, cmpart, cmpart2 CASCADE;
+RESET client_min_messages;
 \set HIDE_TOAST_COMPRESSION false
 -- ensure we get stable results regardless of installation's default
 SET default_toast_compression = 'pglz';
diff --git a/src/test/regress/expected/explain.out b/src/test/regress/expected/explain.out
index 1aca77491b..d8ad54ec5d 100644
--- a/src/test/regress/expected/explain.out
+++ b/src/test/regress/expected/explain.out
@@ -1,3 +1,6 @@
+SET client_min_messages TO 'warning';
+DROP FUNCTION IF EXISTS explain_filter, explain_filter_to_json CASCADE;
+RESET client_min_messages;
 --
 -- EXPLAIN
 --
diff --git a/src/test/regress/expected/indexing.out b/src/test/regress/expected/indexing.out
index 3e5645c2ab..cbd5d3affb 100644
--- a/src/test/regress/expected/indexing.out
+++ b/src/test/regress/expected/indexing.out
@@ -1,3 +1,7 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS idxpart, idxpart_another, covidxpart, covidxpart3, covidxpart4 CASCADE;
+DROP SCHEMA IF EXISTS regress_indexing CASCADE;
+RESET client_min_messages;
 -- Creating an index on a partitioned table makes the partitions
 -- automatically get the index
 create table idxpart (a int, b int, c text) partition by range (a);
diff --git a/src/test/regress/expected/memoize.out b/src/test/regress/expected/memoize.out
index f5202430f8..8577958685 100644
--- a/src/test/regress/expected/memoize.out
+++ b/src/test/regress/expected/memoize.out
@@ -1,3 +1,6 @@
+SET client_min_messages TO 'warning';
+DROP FUNCTION IF EXISTS explain_memoize CASCADE;
+RESET client_min_messages;
 -- Perform tests on the Memoize node.
 -- The cache hits/misses/evictions from the Memoize node can vary between
 -- machines.  Let's just replace the number with an 'N'.  In order to allow us
diff --git a/src/test/regress/expected/partition_aggregate.out b/src/test/regress/expected/partition_aggregate.out
index 1b900fddf8..317e4c561b 100644
--- a/src/test/regress/expected/partition_aggregate.out
+++ b/src/test/regress/expected/partition_aggregate.out
@@ -1,3 +1,6 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS pagg_tab, pagg_tab1, pagg_tab2, pagg_tab_m, pagg_tab_ml, pagg_tab_para CASCADE;
+RESET client_min_messages;
 --
 -- PARTITION_AGGREGATE
 -- Test partitionwise aggregation on partitioned tables
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 762cc8e53b..894f1db5ff 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -1,3 +1,6 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS prt1, prt2, prt1_e, prt2_e, prt1_m, prt2_m, plt1, plt2, plt1_e, pht1, pht2, pht1_e, prt1_l, prt2_l, prt1_n, prt2_n, prt3_n, prt4_n, alpha, beta CASCADE;
+RESET client_min_messages;
 --
 -- PARTITION_JOIN
 -- Test partitionwise join between partitioned tables
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 2abf759385..05fbb653f7 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -1,3 +1,7 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS part, part_rev, xy_1, boolvalues CASCADE;
+DROP FUNCTION IF EXISTS explain_parallel_append CASCADE;
+RESET client_min_messages;
 --
 -- Test partitioning planner code
 --
@@ -4031,40 +4035,3 @@ drop table rp_prefix_test1;
 drop table rp_prefix_test2;
 drop table rp_prefix_test3;
 drop table hp_prefix_test;
---
--- Check that gen_partprune_steps() detects self-contradiction from clauses
--- regardless of the order of the clauses (Here we use a custom operator to
--- prevent the equivclass.c machinery from reordering the clauses)
---
-create operator === (
-   leftarg = int4,
-   rightarg = int4,
-   procedure = int4eq,
-   commutator = ===,
-   hashes
-);
-create operator class part_test_int4_ops2
-for type int4
-using hash as
-operator 1 ===,
-function 2 part_hashint4_noop(int4, int8);
-create table hp_contradict_test (a int, b int) partition by hash (a part_test_int4_ops2, b part_test_int4_ops2);
-create table hp_contradict_test_p1 partition of hp_contradict_test for values with (modulus 2, remainder 0);
-create table hp_contradict_test_p2 partition of hp_contradict_test for values with (modulus 2, remainder 1);
-explain (costs off) select * from hp_contradict_test where a is null and a === 1 and b === 1;
-        QUERY PLAN        
---------------------------
- Result
-   One-Time Filter: false
-(2 rows)
-
-explain (costs off) select * from hp_contradict_test where a === 1 and b === 1 and a is null;
-        QUERY PLAN        
---------------------------
- Result
-   One-Time Filter: false
-(2 rows)
-
-drop table hp_contradict_test;
-drop operator class part_test_int4_ops2 using hash;
-drop operator ===(int4, int4);
diff --git a/src/test/regress/expected/reloptions.out b/src/test/regress/expected/reloptions.out
index b6aef6f654..4ad006477a 100644
--- a/src/test/regress/expected/reloptions.out
+++ b/src/test/regress/expected/reloptions.out
@@ -1,3 +1,6 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS reloptions_test CASCADE;
+RESET client_min_messages;
 -- Simple create
 CREATE TABLE reloptions_test(i INT) WITH (FiLLFaCToR=30,
 	autovacuum_enabled = false, autovacuum_analyze_scale_factor = 0.2);
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 8e63340782..7cc9f95674 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -1,3 +1,6 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS stats_test_tab1, test_io_vac_strategy, test_io_bulkwrite_strategy CASCADE;
+RESET client_min_messages;
 --
 -- Test cumulative stats system
 --
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index cf46fa3359..84780095e5 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -120,6 +120,45 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # this group.
 # ----------
 test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
 
 # event_trigger cannot run concurrently with any test that runs DDL
 # oidjoins is read-only, though, and should run late for best coverage
diff --git a/src/test/regress/sql/compression.sql b/src/test/regress/sql/compression.sql
index 7179a5002e..006e2cb129 100644
--- a/src/test/regress/sql/compression.sql
+++ b/src/test/regress/sql/compression.sql
@@ -1,3 +1,7 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS cmdata, cmdata1, cmmove1, cmmove3, cmdata2, cmmove2, cmpart, cmpart2 CASCADE;
+RESET client_min_messages;
+
 \set HIDE_TOAST_COMPRESSION false
 
 -- ensure we get stable results regardless of installation's default
diff --git a/src/test/regress/sql/explain.sql b/src/test/regress/sql/explain.sql
index b6b7beab27..d6316d1295 100644
--- a/src/test/regress/sql/explain.sql
+++ b/src/test/regress/sql/explain.sql
@@ -1,3 +1,7 @@
+SET client_min_messages TO 'warning';
+DROP FUNCTION IF EXISTS explain_filter, explain_filter_to_json CASCADE;
+RESET client_min_messages;
+
 --
 -- EXPLAIN
 --
diff --git a/src/test/regress/sql/indexing.sql b/src/test/regress/sql/indexing.sql
index d6e5a06d95..ae74d61114 100644
--- a/src/test/regress/sql/indexing.sql
+++ b/src/test/regress/sql/indexing.sql
@@ -1,3 +1,8 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS idxpart, idxpart_another, covidxpart, covidxpart3, covidxpart4 CASCADE;
+DROP SCHEMA IF EXISTS regress_indexing CASCADE;
+RESET client_min_messages;
+
 -- Creating an index on a partitioned table makes the partitions
 -- automatically get the index
 create table idxpart (a int, b int, c text) partition by range (a);
diff --git a/src/test/regress/sql/memoize.sql b/src/test/regress/sql/memoize.sql
index 29ab1ea62d..365fbcde94 100644
--- a/src/test/regress/sql/memoize.sql
+++ b/src/test/regress/sql/memoize.sql
@@ -1,3 +1,7 @@
+SET client_min_messages TO 'warning';
+DROP FUNCTION IF EXISTS explain_memoize CASCADE;
+RESET client_min_messages;
+
 -- Perform tests on the Memoize node.
 
 -- The cache hits/misses/evictions from the Memoize node can vary between
diff --git a/src/test/regress/sql/partition_aggregate.sql b/src/test/regress/sql/partition_aggregate.sql
index ab070fee24..7daa82e71f 100644
--- a/src/test/regress/sql/partition_aggregate.sql
+++ b/src/test/regress/sql/partition_aggregate.sql
@@ -1,3 +1,7 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS pagg_tab, pagg_tab1, pagg_tab2, pagg_tab_m, pagg_tab_ml, pagg_tab_para CASCADE;
+RESET client_min_messages;
+
 --
 -- PARTITION_AGGREGATE
 -- Test partitionwise aggregation on partitioned tables
diff --git a/src/test/regress/sql/partition_join.sql b/src/test/regress/sql/partition_join.sql
index 9e16f1ca55..0761de6152 100644
--- a/src/test/regress/sql/partition_join.sql
+++ b/src/test/regress/sql/partition_join.sql
@@ -1,3 +1,7 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS prt1, prt2, prt1_e, prt2_e, prt1_m, prt2_m, plt1, plt2, plt1_e, pht1, pht2, pht1_e, prt1_l, prt2_l, prt1_n, prt2_n, prt3_n, prt4_n, alpha, beta CASCADE;
+RESET client_min_messages;
+
 --
 -- PARTITION_JOIN
 -- Test partitionwise join between partitioned tables
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index d1c60b8fe9..bc49ea979b 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1,3 +1,8 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS part, part_rev, xy_1, boolvalues CASCADE;
+DROP FUNCTION IF EXISTS explain_parallel_append CASCADE;
+RESET client_min_messages;
+
 --
 -- Test partitioning planner code
 --
@@ -1193,33 +1198,3 @@ drop table rp_prefix_test1;
 drop table rp_prefix_test2;
 drop table rp_prefix_test3;
 drop table hp_prefix_test;
-
---
--- Check that gen_partprune_steps() detects self-contradiction from clauses
--- regardless of the order of the clauses (Here we use a custom operator to
--- prevent the equivclass.c machinery from reordering the clauses)
---
-
-create operator === (
-   leftarg = int4,
-   rightarg = int4,
-   procedure = int4eq,
-   commutator = ===,
-   hashes
-);
-create operator class part_test_int4_ops2
-for type int4
-using hash as
-operator 1 ===,
-function 2 part_hashint4_noop(int4, int8);
-
-create table hp_contradict_test (a int, b int) partition by hash (a part_test_int4_ops2, b part_test_int4_ops2);
-create table hp_contradict_test_p1 partition of hp_contradict_test for values with (modulus 2, remainder 0);
-create table hp_contradict_test_p2 partition of hp_contradict_test for values with (modulus 2, remainder 1);
-
-explain (costs off) select * from hp_contradict_test where a is null and a === 1 and b === 1;
-explain (costs off) select * from hp_contradict_test where a === 1 and b === 1 and a is null;
-
-drop table hp_contradict_test;
-drop operator class part_test_int4_ops2 using hash;
-drop operator ===(int4, int4);
diff --git a/src/test/regress/sql/reloptions.sql b/src/test/regress/sql/reloptions.sql
index 4252b0202f..5a2de67b36 100644
--- a/src/test/regress/sql/reloptions.sql
+++ b/src/test/regress/sql/reloptions.sql
@@ -1,3 +1,6 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS reloptions_test CASCADE;
+RESET client_min_messages;
 
 -- Simple create
 CREATE TABLE reloptions_test(i INT) WITH (FiLLFaCToR=30,
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index fddf5a8277..cdc93c0672 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -1,3 +1,7 @@
+SET client_min_messages TO 'warning';
+DROP TABLE IF EXISTS stats_test_tab1, test_io_vac_strategy, test_io_bulkwrite_strategy CASCADE;
+RESET client_min_messages;
+
 --
 -- Test cumulative stats system
 --
#3 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Melanie Plageman (#1)
Re: stats test intermittent failure

Hi,

On Tue, Jul 11, 2023 at 3:35 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:

Hi,

Jeff pointed out that one of the pg_stat_io tests has failed a few times
over the past months (here on morepork [1] and more recently here on
francolin [2]).

Failing test diff for those who prefer not to scroll:

+++ /home/bf/bf-build/francolin/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/stats.out
2023-07-07 18:48:25.976313231 +0000
@@ -1415,7 +1415,7 @@
:io_sum_vac_strategy_after_reuses > :io_sum_vac_strategy_before_reuses;
?column? | ?column?
----------+----------
- t        | t
+ t        | f

My theory about the test failure is that, when there is enough demand
for shared buffers, the flapping test fails because it expects buffer
access strategy *reuses* and concurrent queries already flushed those
buffers before they could be reused. Attached is a patch which I think
will fix the test while keeping some code coverage. If we count
evictions and reuses together, those should have increased.

Yeah, I've not reproduced this issue but it's possible. IIUC if we get
the buffer from the ring, we count an I/O as "reuse" even if the
buffer has already been flushed/replaced. However, if the buffer in
the ring is pinned by other backends, we end up evicting a buffer from
outside of the ring and adding it to the ring, which is counted as
"eviction".

Regarding the patch, I have a comment:

 -- Test that reuse of strategy buffers and reads of blocks into these reused
--- buffers while VACUUMing are tracked in pg_stat_io.
+-- buffers while VACUUMing are tracked in pg_stat_io. If there is sufficient
+-- demand for shared buffers from concurrent queries, some blocks may be
+-- evicted from the strategy ring before they can be reused. In such cases,
+-- the backend will evict a block from a shared buffer outside of the ring
+-- and add it to the ring. This is considered an eviction and not a reuse.

The new comment seems not to be accurate if my understanding is
correct. How about the following?

Test that reuse of strategy buffers and reads of blocks into these
reused buffers while VACUUMing are tracked in pg_stat_io. If there is
sufficient demand for shared buffers from concurrent queries, some
buffers may be pinned by other backends before they can be reused. In
such cases, the backend will evict a buffer from a shared buffer
outside of the ring and add it to the ring. This is considered an
eviction and not a reuse.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#4 Andres Freund
andres@anarazel.de
In reply to: Masahiko Sawada (#3)
Re: stats test intermittent failure

Hi,

On 2023-07-31 21:03:07 +0900, Masahiko Sawada wrote:

Regarding the patch, I have a comment:

-- Test that reuse of strategy buffers and reads of blocks into these reused
--- buffers while VACUUMing are tracked in pg_stat_io.
+-- buffers while VACUUMing are tracked in pg_stat_io. If there is sufficient
+-- demand for shared buffers from concurrent queries, some blocks may be
+-- evicted from the strategy ring before they can be reused. In such cases,
+-- the backend will evict a block from a shared buffer outside of the ring
+-- and add it to the ring. This is considered an eviction and not a reuse.

The new comment seems not to be accurate if my understanding is correct. How
about the following?

Test that reuse of strategy buffers and reads of blocks into these
reused buffers while VACUUMing are tracked in pg_stat_io. If there is
sufficient demand for shared buffers from concurrent queries, some
buffers may be pinned by other backends before they can be reused. In
such cases, the backend will evict a buffer from a shared buffer
outside of the ring and add it to the ring. This is considered an
eviction and not a reuse.

I integrated the suggested change of the comment and tweaked it a bit
more. And finally pushed the fix.

Sorry that it took so long.

Greetings,

Andres Freund

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#4)
Re: stats test intermittent failure

Andres Freund <andres@anarazel.de> writes:

I integrated the suggested change of the comment and tweaked it a bit
more. And finally pushed the fix.

This failure was originally seen on v16 (that is, pre-fork). Shouldn't
the fix be back-patched?

regards, tom lane