regression coverage gaps for gist and hash indexes

Started by Andres Freund almost 3 years ago · 13 messages
#1 Andres Freund
andres@anarazel.de

Hi,

I was working on committing patch 0001 from [1] and was a bit confused about
some of the changes to the WAL format for gist and hash index vacuum. It
looked to me that the changed code just flat out would not work.

Turns out the problem is that we don't reach deletion for hash and gist
vacuum:

gist:

Oh, I see. We apparently don't reach the gist deletion code in the tests:
https://coverage.postgresql.org/src/backend/access/gist/gistxlog.c.gcov.html#674
https://coverage.postgresql.org/src/backend/access/gist/gistxlog.c.gcov.html#174

And indeed, if I add an abort() into , it's not reached.

And it's not because tests use a temp table, the caller is also unreachable:
https://coverage.postgresql.org/src/backend/access/gist/gist.c.gcov.html#1643

hash:

And there also are no tests:
https://coverage.postgresql.org/src/backend/access/hash/hashinsert.c.gcov.html#372

I've since looked at other index AMs.

spgist's XLOG_SPGIST_VACUUM_ROOT is emitted, but not replayed:
https://coverage.postgresql.org/src/backend/access/spgist/spgvacuum.c.gcov.html#474
https://coverage.postgresql.org/src/backend/access/spgist/spgxlog.c.gcov.html#962

gin's XLOG_GIN_VACUUM_DATA_LEAF_PAGE is not emitted, but only because of a
RelationNeedsWAL() check:
https://coverage.postgresql.org/src/backend/access/gin/gindatapage.c.gcov.html#852

I also looked at heapam:
XLOG_HEAP2_LOCK_UPDATED is emitted, but not replayed:
https://coverage.postgresql.org/src/backend/access/heap/heapam.c.gcov.html#5487
https://coverage.postgresql.org/src/backend/access/heap/heapam.c.gcov.html#9965

same for XLOG_HEAP2_REWRITE:
https://coverage.postgresql.org/src/backend/access/heap/rewriteheap.c.gcov.html#928
https://coverage.postgresql.org/src/backend/access/heap/heapam.c.gcov.html#9975

and XLOG_HEAP_TRUNCATE (ugh, that record is quite the layering violation):
https://coverage.postgresql.org/src/backend/commands/tablecmds.c.gcov.html#2128
https://coverage.postgresql.org/src/backend/access/heap/heapam.c.gcov.html#9918

The fact that those cases aren't replayed isn't too surprising -
XLOG_HEAP2_LOCK_UPDATED is exercised by isolationtester, XLOG_HEAP2_REWRITE
and XLOG_HEAP_TRUNCATE by contrib/test_decoding. None of these are part of
027_stream_regress.pl.

The lack of any coverage of hash and gist deletion/vacuum seems quite
concerning to me.

Greetings,

Andres Freund

[1]: /messages/by-id/da7184cf-c7b9-c333-801e-0e7507a23ddf@gmail.com

#2 Andrey Borodin
amborodin86@gmail.com
In reply to: Andres Freund (#1)
Re: regression coverage gaps for gist and hash indexes

Hi!

On Fri, Mar 31, 2023 at 8:07 AM Andres Freund <andres@anarazel.de> wrote:

I was working on committing patch 0001 from [1] and was a bit confused about
some of the changes to the WAL format for gist and hash index vacuum. It
looked to me that the changed code just flat out would not work.

Turns out the problem is that we don't reach deletion for hash and gist
vacuum:

gist:

Oh, I see. We apparently don't reach the gist deletion code in the tests:
https://coverage.postgresql.org/src/backend/access/gist/gistxlog.c.gcov.html#674
https://coverage.postgresql.org/src/backend/access/gist/gistxlog.c.gcov.html#174

And indeed, if I add an abort() into , it's not reached.

And it's not because tests use a temp table, the caller is also unreachable:
https://coverage.postgresql.org/src/backend/access/gist/gist.c.gcov.html#1643

GiST logs deletions in gistXLogUpdate(), which is covered.
gistXLogDelete() is only used for cleaning during page splits. I'd
propose refactoring GiST WAL to remove gistXLogDelete() and use
gistXLogUpdate() instead.
However I see that gistXLogPageDelete() is not exercised, and is worth
fixing IMO. Simply adding 10x more data in gist.sql helps, but I think
we can do something more clever...

Best regards, Andrey Borodin.

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrey Borodin (#2)
Re: regression coverage gaps for gist and hash indexes

Andrey Borodin <amborodin86@gmail.com> writes:

On Fri, Mar 31, 2023 at 8:07 AM Andres Freund <andres@anarazel.de> wrote:

Turns out the problem is that we don't reach deletion for hash and gist
vacuum:

GiST logs deletions in gistXLogUpdate(), which is covered.
gistXLogDelete() is only used for cleaning during page splits. I'd
propose refactoring GiST WAL to remove gistXLogDelete() and use
gistXLogUpdate() instead.
However I see that gistXLogPageDelete() is not exercised, and is worth
fixing IMO. Simply adding 10x more data in gist.sql helps, but I think
we can do something more clever...

See also the thread about bug #16329 [1]. Alexander promised to look
into improving the test coverage in this area, maybe he can keep an
eye on the WAL logic coverage too.

regards, tom lane

[1]: /messages/by-id/16329-7a6aa9b6fa1118a1@postgresql.org

#4 Alexander Lakhin
exclusion@gmail.com
In reply to: Tom Lane (#3)
Re: regression coverage gaps for gist and hash indexes

31.03.2023 15:55, Tom Lane wrote:

See also the thread about bug #16329 [1]. Alexander promised to look
into improving the test coverage in this area, maybe he can keep an
eye on the WAL logic coverage too.

Yes, I'm going to analyze that area too. Maybe it'll take more time
(a week or two) if I encounter some bugs there (for now I observe anomalies
with gist__int_ops), but I will definitely try to improve the gist testing.

Best regards,
Alexander

#5 Andres Freund
andres@anarazel.de
In reply to: Andrey Borodin (#2)
Re: regression coverage gaps for gist and hash indexes

Hi,

On 2023-03-31 10:45:51 +0300, Andrey Borodin wrote:

On Fri, Mar 31, 2023 at 8:07 AM Andres Freund <andres@anarazel.de> wrote:

I was working on committing patch 0001 from [1] and was a bit confused about
some of the changes to the WAL format for gist and hash index vacuum. It
looked to me that the changed code just flat out would not work.

Turns out the problem is that we don't reach deletion for hash and gist
vacuum:

gist:

Oh, I see. We apparently don't reach the gist deletion code in the tests:
https://coverage.postgresql.org/src/backend/access/gist/gistxlog.c.gcov.html#674
https://coverage.postgresql.org/src/backend/access/gist/gistxlog.c.gcov.html#174

And indeed, if I add an abort() into , it's not reached.

And it's not because tests use a temp table, the caller is also unreachable:
https://coverage.postgresql.org/src/backend/access/gist/gist.c.gcov.html#1643

GiST logs deletions in gistXLogUpdate(), which is covered.
gistXLogDelete() is only used for cleaning during page splits.

I am not sure what your point here is - deleting entries to prevent a page
split is deleting entries. What am I missing?

I'd propose refactoring GiST WAL to remove gistXLogDelete() and use
gistXLogUpdate() instead.

I think we still would need to have coverage for gistprunepage(), even if
so. So that seems a separate issue.

I think what gist.sql is missing is a combination of delete, index scan (to
mark entries dead), new insertions (to trigger pruning to prevent page
splits).
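
That sequence could be sketched roughly like this (a minimal sketch with made-up table and index names, not the actual test):

```sql
-- Sketch only: exercise gist killtuples + pruning-before-split.
SET enable_seqscan = off;
CREATE TABLE gist_prune_sketch (k point);
CREATE INDEX ON gist_prune_sketch USING gist (k);

-- 1. create index entries pointing to dead heap tuples
INSERT INTO gist_prune_sketch SELECT point(g, 0) FROM generate_series(1, 1000) g;
DELETE FROM gist_prune_sketch WHERE k << point(500, 0);

-- 2. an index scan notices the dead tuples and marks the entries killed
SELECT count(*) FROM gist_prune_sketch WHERE k >> point(0, 0);

-- 3. new insertions into the same key range should prune the killed
--    entries instead of splitting the page
INSERT INTO gist_prune_sketch SELECT point(g, 0) FROM generate_series(1, 300) g;
```

Whether step 2 actually marks entries killed depends on no concurrent snapshot holding the deleted rows visible, which is the fragility discussed later in the thread.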

Greetings,

Andres Freund

#6 Andres Freund
andres@anarazel.de
In reply to: Alexander Lakhin (#4)
1 attachment(s)
Re: regression coverage gaps for gist and hash indexes

Hi,

On 2023-03-31 17:00:00 +0300, Alexander Lakhin wrote:

31.03.2023 15:55, Tom Lane wrote:

See also the thread about bug #16329 [1]. Alexander promised to look
into improving the test coverage in this area, maybe he can keep an
eye on the WAL logic coverage too.

Yes, I'm going to analyze that area too. Maybe it'll take more time
(a week or two) if I encounter some bugs there (for now I observe anomalies
with gist__int_ops), but I will definitely try to improve the gist testing.

Because I needed it to verify the changes in the referenced patch, I wrote
tests exercising killtuples based pruning for gist and hash.

For the former I first wrote it in contrib/btree_gist. But that would mean the
recovery side isn't exercised via 027_stream_regress.pl. So I rewrote it to
use point instead, which is a tad more awkward, but...

For now I left the new tests in their own files. But possibly they should be
in gist.sql and hash_index.sql respectively?

As I also wrote at the top of the tests, we can't easily verify that
killtuples pruning has actually happened, nor guarantee that concurrent
activity doesn't prevent it (e.g. autovacuum having a snapshot open or
such). At least not without losing coverage of WAL logging / replay. To make
it more likely I added them as their own test group.

I don't know if we want the tests in this form, but I do find them useful for
now.

Greetings,

Andres Freund

Attachments:

v1-0001-WIP-Add-test-exercising-hash-and-gist-killtuples.patch (text/x-diff; charset=us-ascii)
From 00a8f347fb5a57f3cafe082cdd9c3ea8f6a62053 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Fri, 31 Mar 2023 14:58:50 -0700
Subject: [PATCH v1] WIP: Add test exercising hash and gist killtuples

---
 src/test/regress/expected/gist_killtuples.out | 78 ++++++++++++++++
 .../expected/hash_index_killtuples.out        | 89 +++++++++++++++++++
 src/test/regress/parallel_schedule            |  4 +
 src/test/regress/sql/gist_killtuples.sql      | 49 ++++++++++
 .../regress/sql/hash_index_killtuples.sql     | 57 ++++++++++++
 5 files changed, 277 insertions(+)
 create mode 100644 src/test/regress/expected/gist_killtuples.out
 create mode 100644 src/test/regress/expected/hash_index_killtuples.out
 create mode 100644 src/test/regress/sql/gist_killtuples.sql
 create mode 100644 src/test/regress/sql/hash_index_killtuples.sql

diff --git a/src/test/regress/expected/gist_killtuples.out b/src/test/regress/expected/gist_killtuples.out
new file mode 100644
index 00000000000..18d39c871a1
--- /dev/null
+++ b/src/test/regress/expected/gist_killtuples.out
@@ -0,0 +1,78 @@
+-- Basic test of gist's killtuples / pruning during split implementation. This
+-- is not guaranteed to reach those paths, concurrent activity could prevent
+-- cleanup of dead tuples or such. But it's likely to reach them, which seems
+-- better than nothing.
+set enable_seqscan=off;
+set enable_bitmapscan=off;
+CREATE TABLE test_gist_killtuples (
+	k point,
+	v int DEFAULT 0
+);
+CREATE INDEX ON test_gist_killtuples USING gist(k);
+-- create index entries pointing to deleted tuples
+INSERT INTO test_gist_killtuples(k, v) SELECT point(generate_series(1, 1000), 0), 0;
+-- via deletes
+DELETE FROM test_gist_killtuples WHERE k << point(500, 0);
+-- and updates
+UPDATE test_gist_killtuples SET k = k + point(1, 0), v = 1 WHERE k >> point(500, 0);
+-- make the index see tuples are dead via killtuples
+PREPARE engage_killtuples AS
+SELECT sum(k <-> point(0, 0)) FROM test_gist_killtuples WHERE k >> point(0, 0);
+EXPLAIN (COSTS OFF) EXECUTE engage_killtuples;
+                                   QUERY PLAN                                   
+--------------------------------------------------------------------------------
+ Aggregate
+   ->  Index Only Scan using test_gist_killtuples_k_idx on test_gist_killtuples
+         Index Cond: (k >> '(0,0)'::point)
+(3 rows)
+
+EXECUTE engage_killtuples;
+  sum   
+--------
+ 376250
+(1 row)
+
+-- create new index entries, in the same ranges, this should see the dead entries and prune the pages
+INSERT INTO test_gist_killtuples(k, v) SELECT point(generate_series(1, 300), 0), 2;
+UPDATE test_gist_killtuples SET k = k + point(1, 0), v = 3 WHERE k >> point(600, 0);
+-- do some basic cross checking of index scans vs sequential scan
+-- use prepared statement, so we can explain and execute without repeating the statement
+PREPARE check_query AS SELECT tgk.v, min(tgk.k <-> point(0, 0)), max(tgk.k <-> point(0, 0)), count(*),
+  brute = ROW(tgk.v, min(tgk.k <-> point(0, 0)), max(tgk.k <-> point(0, 0)), count(*)) AS are_equal
+FROM (
+    SELECT v, min(k <-> point(0, 0)), max(k <-> point(0, 0)), count(*)
+    FROM test_gist_killtuples GROUP BY v ORDER BY v
+  ) brute,
+  test_gist_killtuples tgk
+WHERE tgk.k >> point(brute.min - 1, 0) AND tgk.k << point(brute.max + 1, 0)
+GROUP BY brute, tgk.v
+ORDER BY tgk.v;
+EXPLAIN (COSTS OFF) EXECUTE check_query;
+                                                                                       QUERY PLAN                                                                                        
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ GroupAggregate
+   Group Key: tgk.v, brute.*
+   ->  Sort
+         Sort Key: tgk.v, brute.*
+         ->  Nested Loop
+               ->  Subquery Scan on brute
+                     ->  GroupAggregate
+                           Group Key: test_gist_killtuples.v
+                           ->  Sort
+                                 Sort Key: test_gist_killtuples.v
+                                 ->  Seq Scan on test_gist_killtuples
+               ->  Index Scan using test_gist_killtuples_k_idx on test_gist_killtuples tgk
+                     Index Cond: ((k >> point((brute.min - '1'::double precision), '0'::double precision)) AND (k << point((brute.max + '1'::double precision), '0'::double precision)))
+(13 rows)
+
+EXECUTE check_query;
+ v | min | max  | count | are_equal 
+---+-----+------+-------+-----------
+ 0 | 500 |  500 |     1 | t
+ 1 | 502 |  600 |    99 | t
+ 2 |   1 |  300 |   300 | t
+ 3 | 602 | 1002 |   401 | t
+(4 rows)
+
+-- Leave the table/index around, so that an index that has been affected by
+-- killtuples could be seen by amcheck or such.
diff --git a/src/test/regress/expected/hash_index_killtuples.out b/src/test/regress/expected/hash_index_killtuples.out
new file mode 100644
index 00000000000..2487b98b18a
--- /dev/null
+++ b/src/test/regress/expected/hash_index_killtuples.out
@@ -0,0 +1,89 @@
+-- Basic test of hash's killtuples / pruning during insert implementation. This
+-- is not guaranteed to reach those paths, concurrent activity could prevent
+-- cleanup of dead tuples or such. But it's likely to reach them, which seems
+-- better than nothing.
+set enable_seqscan=off;
+set enable_bitmapscan=off;
+set enable_material=off;
+-- Hash indexes are fairly dense, 1000 tuples isn't enough to easily reach the
+-- killtuples logic. Might need to be adjusted upward for larger page sizes.
+\set high 2000
+CREATE TABLE test_hash_killtuples (
+	k int8,
+	v int DEFAULT 0
+);
+CREATE INDEX ON test_hash_killtuples USING hash(k);
+-- create index entries pointing to deleted tuples
+INSERT INTO test_hash_killtuples(k, v) SELECT generate_series(1, :high), 0;
+-- via deletes
+DELETE FROM test_hash_killtuples WHERE k < (:high / 2);
+-- and updates
+UPDATE test_hash_killtuples SET k = k + 1, v = 1 WHERE k > (:high / 2);
+-- make the index see tuples are dead via killtuples
+PREPARE engage_killtuples AS
+SELECT count(*) FROM generate_series(1, :high + 1) g(i) WHERE EXISTS (SELECT * FROM test_hash_killtuples thk WHERE thk.k = g.i);
+EXPLAIN (COSTS OFF) EXECUTE engage_killtuples;
+                                     QUERY PLAN                                      
+-------------------------------------------------------------------------------------
+ Aggregate
+   ->  Nested Loop Semi Join
+         ->  Function Scan on generate_series g
+         ->  Index Scan using test_hash_killtuples_k_idx on test_hash_killtuples thk
+               Index Cond: (k = g.i)
+(5 rows)
+
+EXECUTE engage_killtuples;
+ count 
+-------
+  1001
+(1 row)
+
+-- create new index entries, in the same ranges, this should see the dead entries and prune the pages
+INSERT INTO test_hash_killtuples(k, v) SELECT generate_series(1, :high * 0.3), 2;
+UPDATE test_hash_killtuples SET k = k + 1, v = 3 WHERE k > (:high * 0.6);
+-- do some basic cross checking of index scans vs sequential scan
+-- use prepared statement, so we can explain and execute without repeating the statement
+PREPARE check_query AS SELECT thk.v, min(thk.k), max(thk.k), count(*),
+  brute = ROW(thk.v, min(thk.k), max(thk.k), count(*)) AS are_equal
+FROM (
+    SELECT v, min(k), max(k), count(*)
+    FROM test_hash_killtuples GROUP BY v ORDER BY v
+  ) brute,
+  -- hash only does equality lookups, so simulate a range scan using generate_series
+  generate_series(brute.min, brute.max) g(i),
+  test_hash_killtuples thk
+WHERE
+  thk.k = g.i
+GROUP BY brute, thk.v
+ORDER BY thk.v;
+EXPLAIN (COSTS OFF) EXECUTE check_query;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ GroupAggregate
+   Group Key: thk.v, brute.*
+   ->  Sort
+         Sort Key: thk.v, brute.*
+         ->  Nested Loop
+               ->  Nested Loop
+                     ->  Subquery Scan on brute
+                           ->  GroupAggregate
+                                 Group Key: test_hash_killtuples.v
+                                 ->  Sort
+                                       Sort Key: test_hash_killtuples.v
+                                       ->  Seq Scan on test_hash_killtuples
+                     ->  Function Scan on generate_series g
+               ->  Index Scan using test_hash_killtuples_k_idx on test_hash_killtuples thk
+                     Index Cond: (k = g.i)
+(15 rows)
+
+EXECUTE check_query;
+ v | min  | max  | count | are_equal 
+---+------+------+-------+-----------
+ 0 | 1000 | 1000 |     1 | t
+ 1 | 1002 | 1200 |   199 | t
+ 2 |    1 |  600 |   600 | t
+ 3 | 1202 | 2002 |   801 | t
+(4 rows)
+
+-- Leave the table/index around, so that an index that has been affected by
+-- killtuples could be seen by amcheck or such.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 36240356393..c36b310e8f6 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -121,6 +121,10 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # ----------
 test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats
 
+# these tests benefit from running in isolation, to be able to remove dead rows
+test: hash_index_killtuples
+test: gist_killtuples
+
 # event_trigger cannot run concurrently with any test that runs DDL
 # oidjoins is read-only, though, and should run late for best coverage
 test: event_trigger oidjoins
diff --git a/src/test/regress/sql/gist_killtuples.sql b/src/test/regress/sql/gist_killtuples.sql
new file mode 100644
index 00000000000..60664a18eee
--- /dev/null
+++ b/src/test/regress/sql/gist_killtuples.sql
@@ -0,0 +1,49 @@
+-- Basic test of gist's killtuples / pruning during split implementation. This
+-- is not guaranteed to reach those paths, concurrent activity could prevent
+-- cleanup of dead tuples or such. But it's likely to reach them, which seems
+-- better than nothing.
+set enable_seqscan=off;
+set enable_bitmapscan=off;
+
+CREATE TABLE test_gist_killtuples (
+	k point,
+	v int DEFAULT 0
+);
+
+CREATE INDEX ON test_gist_killtuples USING gist(k);
+
+-- create index entries pointing to deleted tuples
+INSERT INTO test_gist_killtuples(k, v) SELECT point(generate_series(1, 1000), 0), 0;
+-- via deletes
+DELETE FROM test_gist_killtuples WHERE k << point(500, 0);
+-- and updates
+UPDATE test_gist_killtuples SET k = k + point(1, 0), v = 1 WHERE k >> point(500, 0);
+
+-- make the index see tuples are dead via killtuples
+PREPARE engage_killtuples AS
+SELECT sum(k <-> point(0, 0)) FROM test_gist_killtuples WHERE k >> point(0, 0);
+EXPLAIN (COSTS OFF) EXECUTE engage_killtuples;
+EXECUTE engage_killtuples;
+
+-- create new index entries, in the same ranges, this should see the dead entries and prune the pages
+INSERT INTO test_gist_killtuples(k, v) SELECT point(generate_series(1, 300), 0), 2;
+UPDATE test_gist_killtuples SET k = k + point(1, 0), v = 3 WHERE k >> point(600, 0);
+
+-- do some basic cross checking of index scans vs sequential scan
+-- use prepared statement, so we can explain and execute without repeating the statement
+PREPARE check_query AS SELECT tgk.v, min(tgk.k <-> point(0, 0)), max(tgk.k <-> point(0, 0)), count(*),
+  brute = ROW(tgk.v, min(tgk.k <-> point(0, 0)), max(tgk.k <-> point(0, 0)), count(*)) AS are_equal
+FROM (
+    SELECT v, min(k <-> point(0, 0)), max(k <-> point(0, 0)), count(*)
+    FROM test_gist_killtuples GROUP BY v ORDER BY v
+  ) brute,
+  test_gist_killtuples tgk
+WHERE tgk.k >> point(brute.min - 1, 0) AND tgk.k << point(brute.max + 1, 0)
+GROUP BY brute, tgk.v
+ORDER BY tgk.v;
+
+EXPLAIN (COSTS OFF) EXECUTE check_query;
+EXECUTE check_query;
+
+-- Leave the table/index around, so that an index that has been affected by
+-- killtuples could be seen by amcheck or such.
diff --git a/src/test/regress/sql/hash_index_killtuples.sql b/src/test/regress/sql/hash_index_killtuples.sql
new file mode 100644
index 00000000000..abd15f9a464
--- /dev/null
+++ b/src/test/regress/sql/hash_index_killtuples.sql
@@ -0,0 +1,57 @@
+-- Basic test of hash's killtuples / pruning during insert implementation. This
+-- is not guaranteed to reach those paths, concurrent activity could prevent
+-- cleanup of dead tuples or such. But it's likely to reach them, which seems
+-- better than nothing.
+set enable_seqscan=off;
+set enable_bitmapscan=off;
+set enable_material=off;
+
+-- Hash indexes are fairly dense, 1000 tuples isn't enough to easily reach the
+-- killtuples logic. Might need to be adjusted upward for larger page sizes.
+\set high 2000
+
+CREATE TABLE test_hash_killtuples (
+	k int8,
+	v int DEFAULT 0
+);
+
+CREATE INDEX ON test_hash_killtuples USING hash(k);
+
+-- create index entries pointing to deleted tuples
+INSERT INTO test_hash_killtuples(k, v) SELECT generate_series(1, :high), 0;
+-- via deletes
+DELETE FROM test_hash_killtuples WHERE k < (:high / 2);
+-- and updates
+UPDATE test_hash_killtuples SET k = k + 1, v = 1 WHERE k > (:high / 2);
+
+-- make the index see tuples are dead via killtuples
+PREPARE engage_killtuples AS
+SELECT count(*) FROM generate_series(1, :high + 1) g(i) WHERE EXISTS (SELECT * FROM test_hash_killtuples thk WHERE thk.k = g.i);
+EXPLAIN (COSTS OFF) EXECUTE engage_killtuples;
+EXECUTE engage_killtuples;
+
+-- create new index entries, in the same ranges, this should see the dead entries and prune the pages
+INSERT INTO test_hash_killtuples(k, v) SELECT generate_series(1, :high * 0.3), 2;
+UPDATE test_hash_killtuples SET k = k + 1, v = 3 WHERE k > (:high * 0.6);
+
+-- do some basic cross checking of index scans vs sequential scan
+-- use prepared statement, so we can explain and execute without repeating the statement
+PREPARE check_query AS SELECT thk.v, min(thk.k), max(thk.k), count(*),
+  brute = ROW(thk.v, min(thk.k), max(thk.k), count(*)) AS are_equal
+FROM (
+    SELECT v, min(k), max(k), count(*)
+    FROM test_hash_killtuples GROUP BY v ORDER BY v
+  ) brute,
+  -- hash only does equality lookups, so simulate a range scan using generate_series
+  generate_series(brute.min, brute.max) g(i),
+  test_hash_killtuples thk
+WHERE
+  thk.k = g.i
+GROUP BY brute, thk.v
+ORDER BY thk.v;
+
+EXPLAIN (COSTS OFF) EXECUTE check_query;
+EXECUTE check_query;
+
+-- Leave the table/index around, so that an index that has been affected by
+-- killtuples could be seen by amcheck or such.
-- 
2.38.0

#7 Drouvot, Bertrand
bertranddrouvot.pg@gmail.com
In reply to: Andres Freund (#6)
Re: regression coverage gaps for gist and hash indexes

Hi,

On 4/1/23 1:13 AM, Andres Freund wrote:

Hi,

On 2023-03-31 17:00:00 +0300, Alexander Lakhin wrote:

31.03.2023 15:55, Tom Lane wrote:

See also the thread about bug #16329 [1]. Alexander promised to look
into improving the test coverage in this area, maybe he can keep an
eye on the WAL logic coverage too.

Yes, I'm going to analyze that area too. Maybe it'll take more time
(a week or two) if I encounter some bugs there (for now I observe anomalies
with gist__int_ops), but I will definitely try to improve the gist testing.

Because I needed it to verify the changes in the referenced patch, I wrote
tests exercising killtuples based pruning for gist and hash.

Thanks for the patch!

I did not look at the details but "just" checked that the coverage is now there.

And indeed, when running "make check" + "027_stream_regress.pl":

I can see it moving from (without the patch):

function gistXLogDelete called 0 returned 0% blocks executed 0%
function gistRedoDeleteRecord called 0 returned 0% blocks executed 0%
function gistprunepage called 0 returned 0% blocks executed 0%
function _hash_vacuum_one_page called 0 returned 0% blocks executed 0%

to (with the patch):

function gistXLogDelete called 9 returned 100% blocks executed 100%
function gistRedoDeleteRecord called 5 returned 100% blocks executed 100% (thanks to 027_stream_regress.pl)
function gistprunepage called 9 returned 100% blocks executed 79%
function _hash_vacuum_one_page called 12 returned 100% blocks executed 94%

For now I left the new tests in their own files. But possibly they should be
in gist.sql and hash_index.sql respectively?

+1 to put them in gist.sql and hash_index.sql.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#8 Andres Freund
andres@anarazel.de
In reply to: Drouvot, Bertrand (#7)
Re: regression coverage gaps for gist and hash indexes

Hi,

On 2023-04-01 06:02:47 +0200, Drouvot, Bertrand wrote:

On 4/1/23 1:13 AM, Andres Freund wrote:

On 2023-03-31 17:00:00 +0300, Alexander Lakhin wrote:

31.03.2023 15:55, Tom Lane wrote:

See also the thread about bug #16329 [1]. Alexander promised to look
into improving the test coverage in this area, maybe he can keep an
eye on the WAL logic coverage too.

Yes, I'm going to analyze that area too. Maybe it'll take more time
(a week or two) if I encounter some bugs there (for now I observe anomalies
with gist__int_ops), but I will definitely try to improve the gist testing.

Because I needed it to verify the changes in the referenced patch, I wrote
tests exercising killtuples based pruning for gist and hash.

Thanks for the patch!

I did not look at the details but "just" checked that the coverage is now there.

And indeed, when running "make check" + "027_stream_regress.pl":

I can see it moving from (without the patch):

function gistXLogDelete called 0 returned 0% blocks executed 0%
function gistRedoDeleteRecord called 0 returned 0% blocks executed 0%
function gistprunepage called 0 returned 0% blocks executed 0%
function _hash_vacuum_one_page called 0 returned 0% blocks executed 0%

to (with the patch):

function gistXLogDelete called 9 returned 100% blocks executed 100%
function gistRedoDeleteRecord called 5 returned 100% blocks executed 100% (thanks to 027_stream_regress.pl)
function gistprunepage called 9 returned 100% blocks executed 79%
function _hash_vacuum_one_page called 12 returned 100% blocks executed 94%

For now I left the new tests in their own files. But possibly they should be
in gist.sql and hash_index.sql respectively?

+1 to put them in gist.sql and hash_index.sql.

Unfortunately it turns out that running them in a parallel group reliably
prevents cleanup of the dead rows, at least on my machine, thereby preventing
any increase in coverage. As they need to run serially, I think it makes more
sense to keep the tests focused and leave gist.sql and hash_index.sql to run
in parallel.

Greetings,

Andres Freund

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#8)
Re: regression coverage gaps for gist and hash indexes

Andres Freund <andres@anarazel.de> writes:

On 2023-04-01 06:02:47 +0200, Drouvot, Bertrand wrote:

+1 to put them in gist.sql and hash_index.sql.

Unfortunately it turns out that running them in a parallel group reliably
prevents cleanup of the dead rows, at least on my machine, thereby preventing
any increase in coverage. As they need to run serially, I think it makes more
sense to keep the tests focused and leave gist.sql and hash_index.sql to run
in parallel.

If they have to run serially then that means that their runtime
contributes 1-for-1 to the total runtime of the core regression tests,
which is not nice at all. Can we move them to some other portion
of our test suite, preferably one that's not repeated four or more
times in each buildfarm run?

regards, tom lane

#10 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#9)
Re: regression coverage gaps for gist and hash indexes

Hi,

On 2023-04-02 12:38:32 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-04-01 06:02:47 +0200, Drouvot, Bertrand wrote:

+1 to put them in gist.sql and hash_index.sql.

Unfortunately it turns out that running them in a parallel group reliably
prevents cleanup of the dead rows, at least on my machine, thereby preventing
any increase in coverage. As they need to run serially, I think it makes more
sense to keep the tests focused and leave gist.sql and hash_index.sql to run
in parallel.

If they have to run serially then that means that their runtime
contributes 1-for-1 to the total runtime of the core regression tests,
which is not nice at all.

Agreed, it's not nice. At least they're reasonably quick (74ms and 54ms in one
run here)...

Can we move them to some other portion of our test suite, preferably one
that's not repeated four or more times in each buildfarm run?

Not trivially, at least. Right now 027_stream_regress.pl doesn't run other
tests, so we'd not cover the replay portion if we moved the tests to
contrib/btree_gist or such.

I wonder if we should have a test infrastructure function for waiting for the
visibility horizon to have passed a certain point. I think that might be
useful for making a few other tests robust...
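
As a rough sketch, such a helper could poll pg_stat_activity until no other session's snapshot can hold back the horizon (the function name and the whole approach here are hypothetical, not existing infrastructure):

```sql
-- Hypothetical helper: block until no other session holds a snapshot
-- that could keep dead rows visible, so killtuples / pruning can happen.
CREATE FUNCTION wait_for_horizon() RETURNS void AS $$
BEGIN
    WHILE EXISTS (SELECT 1 FROM pg_stat_activity
                  WHERE backend_xmin IS NOT NULL
                    AND pid <> pg_backend_pid()) LOOP
        PERFORM pg_sleep(0.01);
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```

A real version would presumably need a timeout and a way to ignore harmless background workers rather than waiting on every backend.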

Greetings,

Andres Freund

#11 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#10)
Re: regression coverage gaps for gist and hash indexes

Andres Freund <andres@anarazel.de> writes:

On 2023-04-02 12:38:32 -0400, Tom Lane wrote:

If they have to run serially then that means that their runtime
contributes 1-for-1 to the total runtime of the core regression tests,
which is not nice at all.

Agreed, it's not nice. At least they're reasonably quick (74ms and 54ms in one
run here)...

Oh, that's less bad than I expected. The discussion in the other thread
was pointing in the direction of needing hundreds of ms to make indexes
that are big enough to hit all the code paths.

Can we move them to some other portion of our test suite, preferably one
that's not repeated four or more times in each buildfarm run?

Not trivially, at least. Right now 027_stream_regress.pl doesn't run other
tests, so we'd not cover the replay portion if we moved the tests to
contrib/btree_gist or such.

Yeah, I was imagining teaching 027_stream_regress.pl to run additional
scripts that aren't in the core tests. (I'm still quite unhappy that
027_stream_regress.pl uses the core tests at all, really, as they were
never particularly designed to cover what it cares about. The whole
thing is extremely inefficient and it's no surprise that its coverage
is scattershot.)

regards, tom lane

#12 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#11)
Re: regression coverage gaps for gist and hash indexes

Hi,

On 2023-04-02 13:03:51 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-04-02 12:38:32 -0400, Tom Lane wrote:

If they have to run serially then that means that their runtime
contributes 1-for-1 to the total runtime of the core regression tests,
which is not nice at all.

Agreed, it's not nice. At least reasonably quick (74ms and 54ms on one run
here)...

Oh, that's less bad than I expected. The discussion in the other thread
was pointing in the direction of needing hundreds of ms to make indexes
that are big enough to hit all the code paths.

Well, the tests here really just try to hit the killtuples path, not some of
the paths discussed in [1]. It needs just enough index entries to encounter a
page split (which then is avoided by pruning tuples).
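For context, a rough sketch of the shape such a killtuples test takes (this is a hedged illustration, not the actual committed test; the table name matches the one Alexander mentions later in the thread, but the sizes and predicates here are made up):

```sql
-- Sketch: fill enough entries to approach a page split, make most of
-- them dead, let an index scan set kill bits, then insert again so
-- pruning of the dead entries is triggered instead of a page split.
CREATE TABLE test_gist_killtuples (k point);
CREATE INDEX ON test_gist_killtuples USING gist (k);
INSERT INTO test_gist_killtuples
    SELECT point(i, 0) FROM generate_series(1, 400) i;
DELETE FROM test_gist_killtuples WHERE k << point(350, 0);
-- an index scan over the deleted range can mark index entries dead...
SET enable_seqscan = off;
SELECT count(*) FROM test_gist_killtuples WHERE k >> point(0, 0);
-- ...so that this insert can prune them rather than split the page
INSERT INTO test_gist_killtuples VALUES (point(500, 0));
```

Whether the dead entries are actually prunable at that point still depends on the visibility horizon, which is exactly the fragility discussed earlier in the thread.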

Looks like the test in [1] could be made a lot cheaper by changing effective_cache_size
for just that test:

	/*
	 * In 'auto' mode, check if the index has grown too large to fit in cache,
	 * and switch to buffering mode if it has.
	 *
	 * To avoid excessive calls to smgrnblocks(), only check this every
	 * BUFFERING_MODE_SWITCH_CHECK_STEP index tuples.
	 *
	 * In 'stats' state, switch as soon as we have seen enough tuples to have
	 * some idea of the average tuple size.
	 */
	if ((buildstate->buildMode == GIST_BUFFERING_AUTO &&
		 buildstate->indtuples % BUFFERING_MODE_SWITCH_CHECK_STEP == 0 &&
		 effective_cache_size < smgrnblocks(RelationGetSmgr(index),
											MAIN_FORKNUM)) ||
		(buildstate->buildMode == GIST_BUFFERING_STATS &&
		 buildstate->indtuples >= BUFFERING_MODE_TUPLE_SIZE_STATS_TARGET))
	{
		/*
		 * Index doesn't fit in effective cache anymore. Try to switch to
		 * buffering build mode.
		 */
		gistInitBuffering(buildstate);
	}
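Given the condition above, one cheap way to trip that branch might be to shrink effective_cache_size around a single small index build. A sketch, assuming a buffering = auto build (table, column, and index names are hypothetical):

```sql
BEGIN;
-- 1MB = 128 blocks: even a modest index exceeds this, so the
-- smgrnblocks() comparison above trips early and the build
-- switches to buffering mode without needing a huge index.
SET LOCAL effective_cache_size = '1MB';
CREATE INDEX tbl_buffered_idx ON tbl USING gist (pt)
    WITH (buffering = auto);
COMMIT;
```

SET LOCAL keeps the change scoped to the transaction, so the rest of the test run is unaffected.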

Can we move them to some other portion of our test suite, preferably one
that's not repeated four or more times in each buildfarm run?

Not trivially, at least. Right now 027_stream_regress.pl doesn't run other
tests, so we'd not cover the replay portion if we moved the tests to
contrib/btree_gist or such.

Yeah, I was imagining teaching 027_stream_regress.pl to run additional
scripts that aren't in the core tests.

Not opposed to that, but I'm not quite sure about what we'd use as
infrastructure. A second schedule?

I think the tests patches I am proposing here are quite valuable to run
without replication involved as well, the killtuples path isn't trivial, so
having it be covered by the normal regression tests makes sense to me.

(I'm still quite unhappy that 027_stream_regress.pl uses the core tests at
all, really, as they were never particularly designed to cover what it cares
about. The whole thing is extremely inefficient and it's no surprise that
its coverage is scattershot.)

I don't think anybody would claim it's great as-is. But I still think that
having a meaningful coverage of replay is a heck of a lot better than not
having any, even if it's not a pretty or all that fast design. And the fact
that 027_stream_regress.pl runs with a small shared_buffers actually shook out
a few bugs...

I don't think we'd want to use a completely separate set of tests for
027_stream_regress.pl, typical tests will provide coverage on both the primary
and the standby, I think, and would just end up being duplicated between the
main regression test and something specific for 027_stream_regress.pl. But I
could imagine that it's worth maintaining a distinct version of
parallel_schedule that removes the tests that aren't likely to provide benefits
for 027_stream_regress.pl.

Btw, from what I can tell, the main bottleneck for the main regression test
right now is the granular use of groups. Because the parallel groups have
fixed size limit, there's a stall waiting for the slowest test at the end of
each group. If we instead limited group concurrency solely in pg_regress,
instead of the schedule, a quick experiment suggest we could get a good bit of
speedup. And remove some indecision paralysis about which group to add a new
test to, as well as removing the annoyance of having to count the number of
tests in a group manually.
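To illustrate the difference (the pg_regress option shown is hypothetical, sketching the idea rather than an existing switch):

```
# today: concurrency is encoded in the schedule itself, and each "test:"
# group waits for its slowest member before the next group starts
test: create_index create_view create_aggregate ...   # up to ~20 names

# the alternative: one test per line, with pg_regress enforcing a global
# concurrency cap itself, e.g. something like --max-concurrent-tests=20
test: create_index
test: create_view
test: create_aggregate
```

With the flat form, a new test can be appended anywhere without counting group members or worrying about which group it slows down.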

Greetings,

Andres Freund

[1]: /messages/by-id/16329-7a6aa9b6fa1118a1@postgresql.org

#13Alexander Lakhin
exclusion@gmail.com
In reply to: Alexander Lakhin (#4)
1 attachment(s)
Re: regression coverage gaps for gist and hash indexes

Hi,

31.03.2023 17:00, Alexander Lakhin wrote:

31.03.2023 15:55, Tom Lane wrote:

See also the thread about bug #16329 [1]. Alexander promised to look
into improving the test coverage in this area, maybe he can keep an
eye on the WAL logic coverage too.

Yes, I'm going to analyze that area too. Maybe it'll take more time
(a week or two) if I encounter some bugs there (for now I observe anomalies
with gist__int_ops), but I will definitely try to improve the gist testing.

After 2+ weeks of research I'd like to summarize my findings.
1) The checking query proposed in [1] could be improved by adding
the restriction "tgk.v = brute.v" to the condition:
WHERE tgk.k >> point(brute.min - 1, 0) AND tgk.k << point(brute.max + 1, 0)
Otherwise that query gives a false positive after
insert into test_gist_killtuples values(point(505, 0));

A similar improved condition could be placed in hash_index_killtuples.sql.

Yet another improvement for the checking query could be done with the
replacement:
min(k <-> point(0, 0)), max(k <-> point(0, 0)) ->
min(k <-> point(0, k[1])), max(k <-> point(0, k[1])) ...

It doesn't change the query plan dramatically, but the query becomes more
universal (it would work for points with any non-negative integer x).

2) I've checked clang's scan-build notices related to gist as I planned [2],
namely:
Logic error    Branch condition evaluates to a garbage value    src/backend/access/gist/gistutil.c    gistCompressValues    606
Logic error    Dereference of null pointer    src/backend/access/gist/gist.c    gistFindCorrectParent    1099
Logic error    Dereference of null pointer    src/backend/access/gist/gist.c    gistdoinsert    671
Logic error    Dereference of null pointer    src/backend/access/gist/gist.c    gistfinishsplit    1339
Logic error    Dereference of null pointer    src/backend/access/gist/gist.c    gistplacetopage    340
Logic error    Dereference of null pointer    src/backend/access/gist/gistbuildbuffers.c    gistPushItupToNodeBuffer    366
Logic error    Result of operation is garbage or undefined    src/backend/access/gist/gistbuildbuffers.c    gistRelocateBuildBuffersOnSplit    677
Logic error    Result of operation is garbage or undefined    src/backend/access/gist/gistutil.c    gistchoose    463
Unused code    Dead assignment    src/backend/access/gist/gist.c    gistdoinsert    843

And found that all of them (except for the last one, which isn't worth
fixing, IMO) are false positives (I can present detailed explanations if that
would be of interest). So I see no grounds here to build new tests on.

3) To date I found other anomalies more or less related to gist:
fillfactor is ignored for sorted index build mode, which is effectively the default now [3]
amcheck verification for gist is not yet ready to use [4] (and the latest patch doesn't apply to the current HEAD)
bug #17888: Incorrect memory access in gist__int_ops for an input array with many elements [5]

4) I've constructed some tests that provide full coverage for
gistFindCorrectParent(), reach the "very rare situation" path, and cover
gistfixsplit(), but all these tests execute concurrent queries, so they
can't be implemented as simple regression tests. Moreover, I could not find
any explicit problems when reaching those places (I used the checking query
from [1] in the absence of other means to check gist indexes), so I see no value
in developing (not to speak of committing) these tests for now. I'm going to
further explore the gist behavior in those dark corners, but it looks like
a long-term task, so I think it shouldn't delay the gist coverage improvement
already proposed.

5)
02.04.2023 20:50, Andres Freund wrote:

Looks like the test in [1] could be made a lot cheaper by changing effective_cache_size
for just that test:

The effective_cache_size is taken into account only when buffering = auto, but in
that test we have buffering = on, so changing it wouldn't help there.

While looking at gist-related tests, I've noticed an incorrect comment
in index_including_gist.sql:
 * 1.1. test CREATE INDEX with buffered build

It's incorrect exactly because with the default effective_cache_size the
buffered build mode is not enabled for that index size (I've confirmed
this with the elog(LOG,..) placed inside gistInitBuffering()).

So I'd like to propose the attached patch, which:
a) demonstrates bug #16329:
With 8e5eef50c reverted, I get:
00:00:00:11.179 1587838 Valgrind detected 1 error(s) during execution of "CREATE INDEX tbl_gist_idx ON tbl_gist
using gist (c4) INCLUDE (c1,c2,c3) WITH (buffering = on);"
b) makes the comment in index_including_gist.sql correct
c) increases the visible test coverage a little, in particular:
 Function 'gistBuffersReleaseBlock'
-Lines executed:66.67% of 9
+Lines executed:100.00% of 9
d) doesn't increase the test duration significantly:
without valgrind I see the difference 84 ms -> 93 ms; under valgrind: 13513 ms -> 14511 ms

Thus, I think it's worth splitting the gist testing improvement activity
into finalizing/accepting the already-emerging patches and into background
research/anomaly findings, which could inspire further enhancements in this
area.

[1]: /messages/by-id/20230331231300.4kkrl44usvy2pmkv@awork3.anarazel.de
[2]: /messages/by-id/cad7055f-0d76-cc31-71d5-f8b600ebb116@gmail.com
[3]: /messages/by-id/fbbfe5dc-3dfa-d54a-3a94-e2bee37b85d8@gmail.com
[4]: /messages/by-id/885cfb61-26e9-e7c1-49a8-02b3fb12b497@gmail.com
[5]: /messages/by-id/17888-f72930e6b5ce8c14@postgresql.org

Best regards,
Alexander

Attachments:

0002-WIP-Improve-gist-testing-per-bug-#16329.patch (text/x-patch)
diff --git a/src/test/regress/expected/index_including_gist.out b/src/test/regress/expected/index_including_gist.out
index ed9906da66..953b4ebd8b 100644
--- a/src/test/regress/expected/index_including_gist.out
+++ b/src/test/regress/expected/index_including_gist.out
@@ -2,25 +2,25 @@
  * 1.1. test CREATE INDEX with buffered build
  */
 -- Regular index with included columns
-CREATE TABLE tbl_gist (c1 int, c2 int, c3 int, c4 box);
--- size is chosen to exceed page size and trigger actual truncation
-INSERT INTO tbl_gist SELECT x, 2*x, 3*x, box(point(x,x+1),point(2*x,2*x+1)) FROM generate_series(1,8000) AS x;
-CREATE INDEX tbl_gist_idx ON tbl_gist using gist (c4) INCLUDE (c1,c2,c3);
+CREATE TABLE tbl_gist (c1 int, c2 int, c3 text, c4 box);
+-- size is chosen to trigger buffer emptying (bug #16329).
+INSERT INTO tbl_gist SELECT x, 2*x, repeat('x', 100), box(point(x,x+1),point(2*x,2*x+1)) FROM generate_series(1,8000) AS x;
+CREATE INDEX tbl_gist_idx ON tbl_gist using gist (c4) INCLUDE (c1,c2,c3) WITH (buffering = on);
 SELECT pg_get_indexdef(i.indexrelid)
 FROM pg_index i JOIN pg_class c ON i.indexrelid = c.oid
 WHERE i.indrelid = 'tbl_gist'::regclass ORDER BY c.relname;
-                                  pg_get_indexdef                                  
------------------------------------------------------------------------------------
- CREATE INDEX tbl_gist_idx ON public.tbl_gist USING gist (c4) INCLUDE (c1, c2, c3)
+                                             pg_get_indexdef                                             
+---------------------------------------------------------------------------------------------------------
+ CREATE INDEX tbl_gist_idx ON public.tbl_gist USING gist (c4) INCLUDE (c1, c2, c3) WITH (buffering='on')
 (1 row)
 
 SELECT * FROM tbl_gist where c4 <@ box(point(1,1),point(10,10));
- c1 | c2 | c3 |     c4      
-----+----+----+-------------
-  1 |  2 |  3 | (2,3),(1,2)
-  2 |  4 |  6 | (4,5),(2,3)
-  3 |  6 |  9 | (6,7),(3,4)
-  4 |  8 | 12 | (8,9),(4,5)
+ c1 | c2 |                                                  c3                                                  |     c4      
+----+----+------------------------------------------------------------------------------------------------------+-------------
+  1 |  2 | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | (2,3),(1,2)
+  2 |  4 | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | (4,5),(2,3)
+  3 |  6 | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | (6,7),(3,4)
+  4 |  8 | xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | (8,9),(4,5)
 (4 rows)
 
 SET enable_bitmapscan TO off;
diff --git a/src/test/regress/sql/index_including_gist.sql b/src/test/regress/sql/index_including_gist.sql
index 7d5c99b2e7..60ba15f8f0 100644
--- a/src/test/regress/sql/index_including_gist.sql
+++ b/src/test/regress/sql/index_including_gist.sql
@@ -3,10 +3,10 @@
  */
 
 -- Regular index with included columns
-CREATE TABLE tbl_gist (c1 int, c2 int, c3 int, c4 box);
--- size is chosen to exceed page size and trigger actual truncation
-INSERT INTO tbl_gist SELECT x, 2*x, 3*x, box(point(x,x+1),point(2*x,2*x+1)) FROM generate_series(1,8000) AS x;
-CREATE INDEX tbl_gist_idx ON tbl_gist using gist (c4) INCLUDE (c1,c2,c3);
+CREATE TABLE tbl_gist (c1 int, c2 int, c3 text, c4 box);
+-- size is chosen to trigger buffer emptying (bug #16329).
+INSERT INTO tbl_gist SELECT x, 2*x, repeat('x', 100), box(point(x,x+1),point(2*x,2*x+1)) FROM generate_series(1,8000) AS x;
+CREATE INDEX tbl_gist_idx ON tbl_gist using gist (c4) INCLUDE (c1,c2,c3) WITH (buffering = on);
 SELECT pg_get_indexdef(i.indexrelid)
 FROM pg_index i JOIN pg_class c ON i.indexrelid = c.oid
 WHERE i.indrelid = 'tbl_gist'::regclass ORDER BY c.relname;