Can I make an injection point wait occur no more than once?
I'm working on adding test coverage to _bt_lock_and_validate_left,
which was enhanced by Postgres 18 commit 1bd4bc85ca. In particular,
coverage of its unhappy path: the path where multiple concurrent page
splits necessitate that the scan (which generally moves to the left)
moves to the right multiple times, until finally it gives up. When it
gives up it returns to the original lastcurrblkno to see what's up
with it -- it'll need to get that page's now-current left sibling
link, beginning the whole process anew (by looping back to the start
of _bt_lock_and_validate_left).
An isolation test that uses injection points seems like a natural
approach (actually, it's likely the *only* approach that can produce a
maintainable test). One session should perform a backwards scan that
is forced to wait at the top of _bt_lock_and_validate_left. Another
session then inserts enough index tuples to cause several leaf page
splits that'll make life harder for the backwards scan. Finally, we
wake the backwards scan session, and get the desired test coverage;
it'll reliably have to do things the hard way.
I have all this working already. However, there are certain aspects of
the isolation test (and the injection points themselves) that seem
unsatisfactory. I could really use a way to make the wait within
_bt_lock_and_validate_left happen no more than once, in a way that's
directly under the control of my isolation test.
Any test like this needs to account for various implementation
details. For example, if the test needs to work with non-standard
BLCKSZ (which seems like a good idea), then the number of page splits
required might be greater or fewer than with standard BLCKSZ. That
means inserting more data than is strictly necessary most of the time,
so that there is some margin of error to account for these effects. By
itself, that shouldn't be much of a problem.
However, as things stand, this does create a problem: accounting for
these implementation details in this manner makes the number of times
that the injection point is reached unpredictable/hard to control. I
only want the wait within _bt_lock_and_validate_left to happen once,
before the concurrent inserts take place from within the other
isolation test session. I don't want any possible future calls to
_bt_lock_and_validate_left (that come after the other session is done)
to wait at all -- that'll make the backwards scan test session wait
forever (since no other session will be around to wake it up a second
or a third time).
I have successfully simulated "wait no more than once" by adding C
code to nbtree that looks like this:
    if (likely(!P_ISDELETED(opaque) &&
               opaque->btpo_next == lastcurrblkno))
    {
        /* Found desired page, return it */
#ifdef USE_INJECTION_POINTS
        if (IS_INJECTION_POINT_ATTACHED("lock-and-validate-left"))
        {
            InjectionPointDetach("lock-and-validate-left");
        }
#endif
But that's pretty ugly and non-modular. There are multiple return
paths within _bt_lock_and_validate_left, and I'd probably need to
cover them all with similar code. That seems borderline unacceptable.
It would be far preferable if I could just use some built-in way of
waiting exactly once, that can be used directly from SQL, through the
injection_points extension. That would allow me to write the isolation
test without having to add any code to nbtsearch.c that knows all
about the requirements of one particular isolation test.
Thanks
--
Peter Geoghegan
On Mon, Jul 07, 2025 at 05:31:30PM -0400, Peter Geoghegan wrote:
I have successfully simulated "wait no more than once" by adding C
code to nbtree that looks like this:
    if (likely(!P_ISDELETED(opaque) &&
               opaque->btpo_next == lastcurrblkno))
    {
        /* Found desired page, return it */
#ifdef USE_INJECTION_POINTS
        if (IS_INJECTION_POINT_ATTACHED("lock-and-validate-left"))
        {
            InjectionPointDetach("lock-and-validate-left");
        }
#endif
But that's pretty ugly and non-modular. There are multiple return
paths within _bt_lock_and_validate_left, and I'd probably need to
cover them all with similar code. That seems borderline unacceptable.
It would be far preferable if I could just use some built-in way of
waiting exactly once, that can be used directly from SQL, through the
injection_points extension. That would allow me to write the isolation
test without having to add any code to nbtsearch.c that knows all
about the requirements of one particular isolation test.
In your test, just detach the injection point while the backend under test is
waiting at the injection point. All of
src/test/modules/injection_points/specs/*.spec use that technique.
On Mon, Jul 7, 2025 at 6:02 PM Noah Misch <noah@leadboat.com> wrote:
In your test, just detach the injection point while the backend under test is
waiting at the injection point. All of
src/test/modules/injection_points/specs/*.spec use that technique.
That appears to work (without the kludge I added to nbtsearch.c),
though I find that I need to detach the injection point *and* wake up
the waiting backend. In that order. Thanks!
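The ordering can be illustrated with a toy model in C. This is only a sketch, not the injection_points implementation: ToyPoint, toy_reach, toy_detach, toy_wakeup and toy_scan are hypothetical names, and the model assumes the worst-case interleaving, where a backend that is woken before the detach lands reaches the still-attached point again. Under those assumptions, detach-then-wakeup lets the scan finish after a single wait, while wakeup-then-detach can leave it blocked with nobody left to wake it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one injection point using a "wait" action. */
typedef struct ToyPoint
{
    bool attached;  /* will a backend reaching the point start to wait? */
    bool waiting;   /* is the backend blocked at the point right now? */
    int  waits;     /* number of times the backend has blocked */
} ToyPoint;

/* Backend side: reaching an attached point blocks the backend. */
static void
toy_reach(ToyPoint *p)
{
    assert(p != NULL);
    if (p->attached)
    {
        p->waiting = true;
        p->waits++;
    }
}

/* Test side: detaching does not release a backend that already waits. */
static void
toy_detach(ToyPoint *p)
{
    assert(p != NULL);
    p->attached = false;
}

/* Test side: release a backend that is currently blocked. */
static void
toy_wakeup(ToyPoint *p)
{
    assert(p != NULL);
    p->waiting = false;
}

/*
 * Simulate a scan that reaches the point "calls" times while the other
 * session issues exactly one detach and one wakeup.  Returns true if the
 * scan finishes, false if it ends up blocked with nobody left to wake it.
 */
static bool
toy_scan(ToyPoint *p, int calls, bool detach_first)
{
    bool detach_pending = false;
    bool wakeup_used = false;

    assert(p != NULL);
    p->attached = true;
    p->waiting = false;
    p->waits = 0;

    for (int i = 0; i < calls; i++)
    {
        toy_reach(p);

        if (detach_pending)
        {
            /* Late detach: the backend may already be blocked again. */
            toy_detach(p);
            detach_pending = false;
        }
        if (!p->waiting)
            continue;
        if (wakeup_used)
            return false;   /* blocked, and the only wakeup is spent */

        if (detach_first)
            toy_detach(p);
        else
            detach_pending = true;  /* backend races ahead of the detach */
        toy_wakeup(p);
        wakeup_used = true;
    }
    return true;
}
```

In this model, toy_scan(&p, 3, true) returns true and leaves p.waits at 1, whereas toy_scan(&p, 3, false) returns false with p.waits at 2. The second outcome corresponds to the "wait forever" hazard described at the start of the thread.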
For what it's worth, I found
src/test/modules/injection_points/specs/basic.spec (which is supposed
to serve as a template) hard to follow. The comments don't seem to
explain what the detach and wait functions actually do, and how and
why one might want to call them together.
--
Peter Geoghegan
On Mon, Jul 07, 2025 at 06:31:33PM -0400, Peter Geoghegan wrote:
On Mon, Jul 7, 2025 at 6:02 PM Noah Misch <noah@leadboat.com> wrote:
In your test, just detach the injection point while the backend under test is
waiting at the injection point. All of
src/test/modules/injection_points/specs/*.spec use that technique.
That appears to work (without the kludge I added to nbtsearch.c),
though I find that I need to detach the injection point *and* wake up
the waiting backend. In that order. Thanks!
That's a property that Noah was looking for when he worked on his
specs for the VACUUM/GRANT frictions. It mimics what one would get
with a debugger: the backend keeps waiting while the point is detached
in parallel.
For what it's worth, I found
src/test/modules/injection_points/specs/basic.spec (which is supposed
to serve as a template) hard to follow. The comments don't seem to
explain what the detach and wait functions actually do, and how and
why one might want to call them together.
If you see ways to improve the existing template, please feel free to
propose something, sure.
--
Michael
On Mon, Jul 7, 2025 at 7:43 PM Michael Paquier <michael@paquier.xyz> wrote:
That's a property that Noah was looking for when he worked on his
specs for the VACUUM/GRANT frictions. It mimics what one would get
with a debugger: the backend keeps waiting while the point is detached
in parallel.
I'm finding that the FreeBSD Meson CI target consistently fails with
this setup, though. And with just about any variant I can think of;
seems to fail quite reliably. The initial SELECT backwards scan
statement will complete without ever waiting (though only on CI).
Do you know what that might be? It would be a lot easier if there was
at least a way to debug this locally.
For what it's worth, I found
src/test/modules/injection_points/specs/basic.spec (which is supposed
to serve as a template) hard to follow. The comments don't seem to
explain what the detach and wait functions actually do, and how and
why one might want to call them together.
If you see ways to improve the existing template, please feel free to
propose something, sure.
I'll need to figure this out for myself first.
--
Peter Geoghegan
On Mon, Jul 07, 2025 at 09:40:20PM -0400, Peter Geoghegan wrote:
I'm finding that the FreeBSD Meson CI target consistently fails with
this setup, though. And with just about any variant I can think of;
seems to fail quite reliably. The initial SELECT backwards scan
statement will complete without ever waiting (though only on CI).
Do you know what that might be? It would be a lot easier if there was
at least a way to debug this locally.
FreeBSD's scheduler is different enough to exercise quite-different relative
timings of process wake-up. I got a lot of FreeBSD failures when my tests had
underspecified the order of events.
If it continues to be a problem, consider sharing the patch that's behaving
this way for you.
On Mon, Jul 7, 2025 at 9:53 PM Noah Misch <noah@leadboat.com> wrote:
If it continues to be a problem, consider sharing the patch that's behaving
this way for you.
Attached patch shows my current progress with the isolation test.
I also attach diff output of the FreeBSD failures. Notice that the
line "backwards_scan_session: NOTICE: notice triggered for injection
point lock-and-validate-new-lastcurrblkno" is completely absent from
the test output. This absence indicates that the desired test coverage
is totally missing on FreeBSD -- so the test is completely broken on
FreeBSD.
I ran "meson test --suite setup --suite nbtree -q --print-errorlogs"
in a loop 500 times on my Debian workstation without seeing any
failures. Seems stable there. Whereas the FreeBSD target hasn't even
passed once out of more than a dozen attempts. Seems to be reliably
broken on FreeBSD.
Thanks
--
Peter Geoghegan
Attachments:
regression.txt (text/plain; charset=US-ASCII)
diff -U3 /tmp/cirrus-ci-build/src/test/modules/nbtree/expected/backwards.out /tmp/cirrus-ci-build/build/testrun/nbtree/isolation/results/backwards.out
--- /tmp/cirrus-ci-build/src/test/modules/nbtree/expected/backwards.out 2025-07-08 14:58:24.118145000 +0000
+++ /tmp/cirrus-ci-build/build/testrun/nbtree/isolation/results/backwards.out 2025-07-08 15:01:24.642562000 +0000
@@ -1,24 +1,7 @@
Parsed test spec with 2 sessions
starting permutation: b_scan i_insert i_detach
-step b_scan: SELECT * FROM backwards_scan_tbl WHERE col % 100 = 1 ORDER BY col DESC; <waiting ...>
-step i_insert: INSERT INTO backwards_scan_tbl SELECT i FROM generate_series(-2000, 700) i;
-step i_detach:
- SELECT injection_points_detach('lock-and-validate-left');
- SELECT injection_points_wakeup('lock-and-validate-left');
-
-injection_points_detach
------------------------
-
-(1 row)
-
-injection_points_wakeup
------------------------
-
-(1 row)
-
-backwards_scan_session: NOTICE: notice triggered for injection point lock-and-validate-new-lastcurrblkno
-step b_scan: <... completed>
+step b_scan: SELECT * FROM backwards_scan_tbl WHERE col % 100 = 1 ORDER BY col DESC;
col
---
601
@@ -30,3 +13,14 @@
1
(7 rows)
+step i_insert: INSERT INTO backwards_scan_tbl SELECT i FROM generate_series(-2000, 700) i;
+step i_detach:
+ SELECT injection_points_detach('lock-and-validate-left');
+ SELECT injection_points_wakeup('lock-and-validate-left');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+ERROR: could not find injection point lock-and-validate-left to wake up
0001-injection-points.patch (application/octet-stream)
From edd7a401e438c4c549037c95e48dcd48a233cb7d Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@bowt.ie>
Date: Mon, 7 Jul 2025 14:11:17 -0400
Subject: [PATCH 1/3] injection points
---
src/backend/access/nbtree/nbtsearch.c | 4 ++
src/test/modules/Makefile | 4 +-
src/test/modules/meson.build | 1 +
src/test/modules/nbtree/.gitignore | 4 ++
src/test/modules/nbtree/Makefile | 28 +++++++++++
.../modules/nbtree/expected/backwards.out | 32 +++++++++++++
src/test/modules/nbtree/meson.build | 17 +++++++
src/test/modules/nbtree/specs/backwards.spec | 46 +++++++++++++++++++
8 files changed, 134 insertions(+), 2 deletions(-)
create mode 100644 src/test/modules/nbtree/.gitignore
create mode 100644 src/test/modules/nbtree/Makefile
create mode 100644 src/test/modules/nbtree/expected/backwards.out
create mode 100644 src/test/modules/nbtree/meson.build
create mode 100644 src/test/modules/nbtree/specs/backwards.spec
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 4af1ff1e9..c250a4b92 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -21,6 +21,7 @@
#include "miscadmin.h"
#include "pgstat.h"
#include "storage/predicate.h"
+#include "utils/injection_point.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
@@ -2500,6 +2501,8 @@ _bt_lock_and_validate_left(Relation rel, BlockNumber *blkno,
{
BlockNumber origblkno = *blkno; /* detects circular links */
+ INJECTION_POINT("lock-and-validate-left", NULL);
+
for (;;)
{
Buffer buf;
@@ -2598,6 +2601,7 @@ _bt_lock_and_validate_left(Relation rel, BlockNumber *blkno,
/* Start from scratch with new lastcurrblkno's blkno/prev link */
*blkno = origblkno = opaque->btpo_prev;
_bt_relbuf(rel, buf);
+ INJECTION_POINT("lock-and-validate-new-lastcurrblkno", NULL);
}
return InvalidBuffer;
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index aa1d27bbe..06b0de718 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -46,9 +46,9 @@ SUBDIRS = \
ifeq ($(enable_injection_points),yes)
-SUBDIRS += injection_points gin typcache
+SUBDIRS += injection_points gin nbtree typcache
else
-ALWAYS_SUBDIRS += injection_points gin typcache
+ALWAYS_SUBDIRS += injection_points gin nbtree typcache
endif
ifeq ($(with_ssl),openssl)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index 9de0057bd..54a3e7ba5 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -9,6 +9,7 @@ subdir('gin')
subdir('injection_points')
subdir('ldap_password_func')
subdir('libpq_pipeline')
+subdir('nbtree')
subdir('oauth_validator')
subdir('plsample')
subdir('spgist_name_ops')
diff --git a/src/test/modules/nbtree/.gitignore b/src/test/modules/nbtree/.gitignore
new file mode 100644
index 000000000..5dcb3ff97
--- /dev/null
+++ b/src/test/modules/nbtree/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/nbtree/Makefile b/src/test/modules/nbtree/Makefile
new file mode 100644
index 000000000..bdcf80e60
--- /dev/null
+++ b/src/test/modules/nbtree/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/nbtree/Makefile
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+ISOLATION = backwards
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/nbtree
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+# XXX: This test is conditional on enable_injection_points in the
+# parent Makefile, so we should never get here in the first place if
+# injection points are not enabled. But the buildfarm 'misc-check'
+# step doesn't pay attention to the if-condition in the parent
+# Makefile. To work around that, disable running the test here too.
+ifeq ($(enable_injection_points),yes)
+include $(top_srcdir)/contrib/contrib-global.mk
+else
+check:
+ @echo "injection points are disabled in this build"
+endif
+
+endif
diff --git a/src/test/modules/nbtree/expected/backwards.out b/src/test/modules/nbtree/expected/backwards.out
new file mode 100644
index 000000000..6b334a2ac
--- /dev/null
+++ b/src/test/modules/nbtree/expected/backwards.out
@@ -0,0 +1,32 @@
+Parsed test spec with 2 sessions
+
+starting permutation: b_scan i_insert i_detach
+step b_scan: SELECT * FROM backwards_scan_tbl WHERE col % 100 = 1 ORDER BY col DESC; <waiting ...>
+step i_insert: INSERT INTO backwards_scan_tbl SELECT i FROM generate_series(-2000, 700) i;
+step i_detach:
+ SELECT injection_points_detach('lock-and-validate-left');
+ SELECT injection_points_wakeup('lock-and-validate-left');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+injection_points_wakeup
+-----------------------
+
+(1 row)
+
+backwards_scan_session: NOTICE: notice triggered for injection point lock-and-validate-new-lastcurrblkno
+step b_scan: <... completed>
+col
+---
+601
+501
+401
+301
+201
+101
+ 1
+(7 rows)
+
diff --git a/src/test/modules/nbtree/meson.build b/src/test/modules/nbtree/meson.build
new file mode 100644
index 000000000..20d77077b
--- /dev/null
+++ b/src/test/modules/nbtree/meson.build
@@ -0,0 +1,17 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+if not get_option('injection_points')
+ subdir_done()
+endif
+
+tests += {
+ 'name': 'nbtree',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'isolation': {
+ 'specs': [
+ 'backwards',
+ ],
+ 'runningcheck': false, # see syscache-update-pruned
+ },
+}
diff --git a/src/test/modules/nbtree/specs/backwards.spec b/src/test/modules/nbtree/specs/backwards.spec
new file mode 100644
index 000000000..780929ee1
--- /dev/null
+++ b/src/test/modules/nbtree/specs/backwards.spec
@@ -0,0 +1,46 @@
+# Backwards scan isolation test
+
+setup
+{
+ CREATE EXTENSION injection_points;
+ CREATE TABLE backwards_scan_tbl(col int4) WITH (autovacuum_enabled = off);
+ CREATE INDEX ON backwards_scan_tbl(col);
+ INSERT INTO backwards_scan_tbl SELECT i FROM generate_series(0, 700) i;
+}
+setup
+{
+ VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) backwards_scan_tbl;
+}
+teardown
+{
+ DROP EXTENSION injection_points;
+ DROP TABLE backwards_scan_tbl;
+}
+
+# Wait happens in backwards_scan_session session, wakeup in insert session
+#
+# The lock-and-validate-left injection point is used for the wait. The
+# lock-and-validate-new-lastcurrblkno injection point is used to generate
+# a notice that confirms that we have the desired test coverage.
+session backwards_scan_session
+setup {
+ SELECT injection_points_set_local();
+ SELECT injection_points_attach('lock-and-validate-new-lastcurrblkno', 'notice');
+ SELECT injection_points_attach('lock-and-validate-left', 'wait');
+ SET enable_seqscan=off;
+ SET enable_sort=off;
+}
+step b_scan { SELECT * FROM backwards_scan_tbl WHERE col % 100 = 1 ORDER BY col DESC; }
+
+session insert_scan_session
+step i_detach {
+ SELECT injection_points_detach('lock-and-validate-left');
+ SELECT injection_points_wakeup('lock-and-validate-left');
+}
+step i_insert { INSERT INTO backwards_scan_tbl SELECT i FROM generate_series(-2000, 700) i; }
+
+# Start a backwards scan session that waits "between pages". Meanwhile, a
+# concurrent session performs insertions that cause many page splits. When
+# the backwards scan session wakes up, it'll have to reason about these
+# concurrent page splits the hard way.
+permutation b_scan i_insert i_detach
--
2.50.0
On Tue, Jul 08, 2025 at 11:21:20AM -0400, Peter Geoghegan wrote:
On Mon, Jul 7, 2025 at 9:53 PM Noah Misch <noah@leadboat.com> wrote:
If it continues to be a problem, consider sharing the patch that's behaving
this way for you.
Attached patch shows my current progress with the isolation test.
Nothing looks suspicious in that code.
I also attach diff output of the FreeBSD failures. Notice that the
line "backwards_scan_session: NOTICE: notice triggered for injection
point lock-and-validate-new-lastcurrblkno" is completely absent from
the test output. This absence indicates that the desired test coverage
is totally missing on FreeBSD -- so the test is completely broken on
FreeBSD.
I ran "meson test --suite setup --suite nbtree -q --print-errorlogs"
in a loop 500 times on my Debian workstation without seeing any
failures. Seems stable there. Whereas the FreeBSD target hasn't even
passed once out of more than a dozen attempts. Seems to be reliably
broken on FreeBSD.
-backwards_scan_session: NOTICE: notice triggered for injection point lock-and-validate-new-lastcurrblkno
+ERROR: could not find injection point lock-and-validate-left to wake up
Agreed. Perhaps it's getting a different plan type on FreeBSD, so it's not
even reaching the INJECTION_POINT() calls? That would be consistent with
these output diffs having no ERROR from attach/detach. Some things I'd try:
- Add a plain elog(WARNING) before each INJECTION_POINT()
- Use debug_print_plan or similar to confirm the plan type
On Tue, Jul 8, 2025 at 11:04 PM Noah Misch <noah@leadboat.com> wrote:
-backwards_scan_session: NOTICE: notice triggered for injection point lock-and-validate-new-lastcurrblkno
+ERROR: could not find injection point lock-and-validate-left to wake up
Agreed. Perhaps it's getting a different plan type on FreeBSD, so it's not
even reaching the INJECTION_POINT() calls? That would be consistent with
these output diffs having no ERROR from attach/detach. Some things I'd try:
- Add a plain elog(WARNING) before each INJECTION_POINT()
- Use debug_print_plan or similar to confirm the plan type
I added an elog(WARNING) trace before each of the two new
INJECTION_POINT() calls.
When I run the test against the FreeBSD CI target with this new
instrumentation, I see a WARNING that indicates that we've reached the
top of _bt_lock_and_validate_left as expected. I don't see any second
WARNING indicating that we've taken _bt_lock_and_validate_left's
unhappy path, though (and the test still fails). This doesn't look
like an issue with the planner.
I attach the relevant regression test output, which shows all this.
Thanks
--
Peter Geoghegan
Attachments:
warning-regression.txt (text/plain; charset=US-ASCII)
diff -U3 /tmp/cirrus-ci-build/src/test/modules/nbtree/expected/backwards.out /tmp/cirrus-ci-build/build/testrun/nbtree/isolation/results/backwards.out
--- /tmp/cirrus-ci-build/src/test/modules/nbtree/expected/backwards.out 2025-07-09 03:22:42.701999000 +0000
+++ /tmp/cirrus-ci-build/build/testrun/nbtree/isolation/results/backwards.out 2025-07-09 03:26:13.802340000 +0000
@@ -1,24 +1,8 @@
Parsed test spec with 2 sessions
starting permutation: b_scan i_insert i_detach
-step b_scan: SELECT * FROM backwards_scan_tbl WHERE col % 100 = 1 ORDER BY col DESC; <waiting ...>
-step i_insert: INSERT INTO backwards_scan_tbl SELECT i FROM generate_series(-2000, 700) i;
-step i_detach:
- SELECT injection_points_detach('lock-and-validate-left');
- SELECT injection_points_wakeup('lock-and-validate-left');
-
-injection_points_detach
------------------------
-
-(1 row)
-
-injection_points_wakeup
------------------------
-
-(1 row)
-
-backwards_scan_session: NOTICE: notice triggered for injection point lock-and-validate-new-lastcurrblkno
-step b_scan: <... completed>
+backwards_scan_session: WARNING: INJECTION_POINT lock-and-validate-left
+step b_scan: SELECT * FROM backwards_scan_tbl WHERE col % 100 = 1 ORDER BY col DESC;
col
---
601
@@ -30,3 +14,14 @@
1
(7 rows)
+step i_insert: INSERT INTO backwards_scan_tbl SELECT i FROM generate_series(-2000, 700) i;
+step i_detach:
+ SELECT injection_points_detach('lock-and-validate-left');
+ SELECT injection_points_wakeup('lock-and-validate-left');
+
+injection_points_detach
+-----------------------
+
+(1 row)
+
+ERROR: could not find injection point lock-and-validate-left to wake up
On Tue, Jul 08, 2025 at 11:43:17PM -0400, Peter Geoghegan wrote:
On Tue, Jul 8, 2025 at 11:04 PM Noah Misch <noah@leadboat.com> wrote:
-backwards_scan_session: NOTICE: notice triggered for injection point lock-and-validate-new-lastcurrblkno
+ERROR: could not find injection point lock-and-validate-left to wake up
Agreed. Perhaps it's getting a different plan type on FreeBSD, so it's not
even reaching the INJECTION_POINT() calls? That would be consistent with
these output diffs having no ERROR from attach/detach. Some things I'd try:
- Add a plain elog(WARNING) before each INJECTION_POINT()
- Use debug_print_plan or similar to confirm the plan type
I added an elog(WARNING) trace before each of the two new
INJECTION_POINT() calls.
When I run the test against the FreeBSD CI target with this new
instrumentation, I see a WARNING that indicates that we've reached the
top of _bt_lock_and_validate_left as expected. I don't see any second
WARNING indicating that we've taken _bt_lock_and_validate_left's
unhappy path, though (and the test still fails). This doesn't look
like an issue with the planner.
I attach the relevant regression test output, which shows all this.
Looking at .cirrus.tasks.yml, I bet the key factor is that the CI task uses
debug_parallel_query=regress. I bet the leader is attached to the injection
point, but the WARNING is reached in a parallel worker.
If that matches what you see, I'd use a PARALLEL RESTRICTED or PARALLEL UNSAFE
function in your query to ensure the code in question runs in the leader.
(Simply overriding debug_parallel_query is less robust, because test runs
could use other settings that cause selection of a parallel plan.)
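Why a parallel worker slips past the wait can be sketched with another toy model in C. It assumes what the spec relies on: points attached after injection_points_set_local() only fire in the process that attached them. ToyLocalPoint, toy_point_fires and toy_scan_waits are hypothetical names and the PIDs are made up; none of this is the extension's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a process-local injection point. */
typedef struct ToyLocalPoint
{
    const char *name;   /* injection point name */
    int owner_pid;      /* PID of the session that attached it */
} ToyLocalPoint;

/*
 * Would a process with PID "my_pid" run the attached action on reaching
 * the point called "name"?  Only the attaching process qualifies.
 */
static bool
toy_point_fires(const ToyLocalPoint *p, const char *name, int my_pid)
{
    assert(p != NULL && name != NULL);
    return strcmp(p->name, name) == 0 && p->owner_pid == my_pid;
}

/*
 * Does the scan wait?  It depends on which process executes the index
 * scan: the leader that attached the point, or a parallel worker.
 */
static bool
toy_scan_waits(const ToyLocalPoint *p, int leader_pid, int worker_pid,
               bool scan_runs_in_worker)
{
    int scanning_pid = scan_runs_in_worker ? worker_pid : leader_pid;

    return toy_point_fires(p, "lock-and-validate-left", scanning_pid);
}
```

With the point owned by PID 1000, toy_scan_waits(&p, 1000, 1001, false) is true and toy_scan_waits(&p, 1000, 1001, true) is false. The second case matches the FreeBSD output earlier in the thread, where the scan completes without ever waiting. Forcing the scan to run in the leader, as suggested, moves the test back to the first case.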
On Wed, Jul 9, 2025 at 10:24 PM Noah Misch <noah@leadboat.com> wrote:
Looking at .cirrus.tasks.yml, I bet the key factor is that the CI task uses
debug_parallel_query=regress. I bet the leader is attached to the injection
point, but the WARNING is reached in a parallel worker.
Yep, that was it.
If that matches what you see, I'd use a PARALLEL RESTRICTED or PARALLEL UNSAFE
function in your query to ensure the code in question runs in the leader.
That seems like the way to go.
At some point I'll start a new thread with a formal patch proposal,
that'll include the tests on this thread. I also plan on using
injection points to write a simple/serial regression test exercising
the nbtree code that completes an incomplete split (following a hard
crash/error).
Thanks again
--
Peter Geoghegan
On Thu, Jul 10, 2025 at 06:58:58PM -0400, Peter Geoghegan wrote:
On Wed, Jul 9, 2025 at 10:24 PM Noah Misch <noah@leadboat.com> wrote:
Looking at .cirrus.tasks.yml, I bet the key factor is that the CI task uses
debug_parallel_query=regress. I bet the leader is attached to the injection
point, but the WARNING is reached in a parallel worker.
Yep, that was it.
Catching up on things a bit. Cool to see that you have found out the
origin of the problem.
At some point I'll start a new thread with a formal patch proposal,
that'll include the tests on this thread. I also plan on using
injection points to write a simple/serial regression test exercising
the nbtree code that completes an incomplete split (following a hard
crash/error).
It sounds to me like an ERROR in an SQL and/or isolation test would be
enough. If you are looking at replay cases, a TAP test would be
the way to go.
--
Michael