BRIN autosummarization lacking a snapshot
If you have a BRIN index with autosummarize=on, and it's based on a
function that requires a snapshot, autovacuum will simply error out when
trying to summarize a range. Here's a reproducer:
create table journal (d timestamp);
create function packdate(d timestamp) returns text as $$
begin return to_char(d, 'yyyymm'); end; $$
language plpgsql returns null on null input immutable;
create index on journal using brin (packdate(d))
with (autosummarize = on, pages_per_range = 1);
Now insert some data,
insert into journal select generate_series(timestamp '2025-01-01', '2025-12-31', '1 day');
and wait for autovacuum to fire. You'll get an error like
2025-11-03 12:17:42.263 CET [536755] ERROR: cannot execute SQL without an outer snapshot or portal
This patch fixes it. I haven't given much thought to adding a good way
to test this yet ...
--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Someone said that it is at least an order of magnitude more work to do
production software than a prototype. I think he is wrong by at least
an order of magnitude." (Brian Kernighan)
Attachments:
0001-Brin-autosummarization-may-need-a-snapshot.patch (text/x-diff)
From e0f81d98b08512f540300a0c3061480b62917b57 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C3=81lvaro=20Herrera?= <alvherre@kurilemu.de>
Date: Mon, 3 Nov 2025 12:20:22 +0100
Subject: [PATCH] Brin autosummarization may need a snapshot
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It's possible to define BRIN indexes on functions that require a
snapshot to run, but the autosummarization feature introduced by commit
7526e10224f0 fails to provide one. Repair.
Author: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: Giovanni Fabris <giovanni.fabris@icon.it>
Reported-by: Arthur Nascimento <tureba@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/202511031106.h4fwyuyui6fz@alvherre.pgsql
---
src/backend/postmaster/autovacuum.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 23cca675f00..2d10ee22377 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2596,7 +2596,9 @@ deleted:
workitem->avw_active = true;
LWLockRelease(AutovacuumLock);
+ PushActiveSnapshot(GetTransactionSnapshot());
perform_work_item(workitem);
+ PopActiveSnapshot();
/*
* Check for config changes before acquiring lock for further jobs.
--
2.47.3
On Mon, Nov 03, 2025 at 12:21:50PM +0100, Alvaro Herrera wrote:
This patch fixes it. I haven't given much thought to adding a good way
to test this yet ...
Spawning an autovacuum worker can feel artistic as we try to make the
tests run fast, but it's not that bad. The trick is to use an
"autovacuum_naptime = 1". Then you could either scan the server logs
for some 'autovacuum: processing database "blah"', or just a polling
query based on pg_stat_all_tables.autovacuum_count. See for example
006_signal_autovacuum.pl.
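For illustration, a minimal polling query of that kind could look like
the following (table name taken from the reproducer upthread; the
exact condition is arbitrary):
  -- returns true once autovacuum has processed the table at least once
  select autovacuum_count > 0
    from pg_stat_all_tables
   where schemaname = 'public' and relname = 'journal';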
--
Michael
On 2025-Nov-04, Michael Paquier wrote:
Spawning an autovacuum worker can feel artistic as we try to make the
tests run fast, but it's not that bad. The trick is to use an
"autovacuum_naptime = 1". Then you could either scan the server logs
for some 'autovacuum: processing database "blah"', or just a polling
query based on pg_stat_all_tables.autovacuum_count. See for example
006_signal_autovacuum.pl.
Ah yes ... and, actually, we already have a file doing a closely related
thing, so I added to it. Here's the patch for master. Backbranches are
essentially identical, modulo these changes for 13 and 14:
-use Test::More tests => 2;
+use Test::More tests => 4;
I'm glad we got rid of that :-)
With my initial try of this test, just counting the number of BRIN
tuples, I was _really_ surprised that the index did indeed contain the
expected number of tuples, even when the error was being thrown. This
turned out to be expected, because the way BRIN summarization works is
that we insert a placeholder tuple first, then update it to the correct
value, and the error only aborts the second part. That's why I needed
to add a WHERE clause to only count non-placeholder tuples.
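In isolation, that check amounts to something like the following
(block 2 being where this small index keeps its regular data page):
  select count(*)
    from brin_page_items(get_raw_page('brin_packdate_idx', 2),
                         'brin_packdate_idx'::regclass)
   where not placeholder;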
I also added a 'sleep(1)', to avoid looping on the query when we know
autovacuum can't possibly have had a chance to run yet.
I unleashed CI on branches 15 and master, and will push soon if they
both turn green.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"La virtud es el justo medio entre dos defectos" (Aristóteles)
Attachments:
v2-0001-BRIN-autosummarization-may-need-a-snapshot.patch (text/x-diff)
From fc8067aa4d905edf1f53a097c87b1e0883dd7c8c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C3=81lvaro=20Herrera?= <alvherre@kurilemu.de>
Date: Tue, 4 Nov 2025 12:38:24 +0100
Subject: [PATCH v2] BRIN autosummarization may need a snapshot
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
It's possible to define BRIN indexes on functions that require a
snapshot to run, but the autosummarization feature introduced by commit
7526e10224f0 fails to provide one. Repair, and add a test to verify
that it works.
Author: Álvaro Herrera <alvherre@kurilemu.de>
Backpatch-through: 13
---
src/backend/postmaster/autovacuum.c | 2 ++
src/test/modules/brin/t/01_workitems.pl | 46 ++++++++++++++++++++++---
2 files changed, 44 insertions(+), 4 deletions(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 5084af7dfb6..59ec45a4e96 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2556,7 +2556,9 @@ deleted:
workitem->avw_active = true;
LWLockRelease(AutovacuumLock);
+ PushActiveSnapshot(GetTransactionSnapshot());
perform_work_item(workitem);
+ PopActiveSnapshot();
/*
* Check for config changes before acquiring lock for further jobs.
diff --git a/src/test/modules/brin/t/01_workitems.pl b/src/test/modules/brin/t/01_workitems.pl
index c3b1fb51706..0744b8825ef 100644
--- a/src/test/modules/brin/t/01_workitems.pl
+++ b/src/test/modules/brin/t/01_workitems.pl
@@ -24,23 +24,61 @@ $node->safe_psql(
create index brin_wi_idx on brin_wi using brin (a) with (pages_per_range=1, autosummarize=on);
'
);
+# Another table with an index that requires a snapshot to run
+$node->safe_psql(
+ 'postgres',
+ 'create table journal (d timestamp) with (fillfactor = 10);
+ create function packdate(d timestamp) returns text language plpgsql
+ as $$ begin return to_char(d, \'yyyymm\'); end; $$
+ returns null on null input immutable;
+ create index brin_packdate_idx on journal using brin (packdate(d))
+ with (autosummarize = on, pages_per_range = 1);
+ '
+);
+
my $count = $node->safe_psql('postgres',
"select count(*) from brin_page_items(get_raw_page('brin_wi_idx', 2), 'brin_wi_idx'::regclass)"
);
-is($count, '1', "initial index state is correct");
+is($count, '1', "initial brin_wi_index index state is correct");
+$count = $node->safe_psql('postgres',
+ "select count(*) from brin_page_items(get_raw_page('brin_packdate_idx', 2), 'brin_packdate_idx'::regclass)"
+);
+is($count, '1', "initial brin_packdate_idx index state is correct");
$node->safe_psql('postgres',
'insert into brin_wi select * from generate_series(1, 100)');
+$node->safe_psql('postgres',
+ "insert into journal select * from generate_series(timestamp '1976-08-01', '1976-10-28', '1 day')"
+);
+
+# Give a little time for autovacuum to react. This matches the naptime
+# configured above.
+sleep(1);
$node->poll_query_until(
'postgres',
"select count(*) > 1 from brin_page_items(get_raw_page('brin_wi_idx', 2), 'brin_wi_idx'::regclass)",
't');
-$count = $node->safe_psql('postgres',
- "select count(*) > 1 from brin_page_items(get_raw_page('brin_wi_idx', 2), 'brin_wi_idx'::regclass)"
+$count = $node->safe_psql(
+ 'postgres',
+ "select count(*) from brin_page_items(get_raw_page('brin_wi_idx', 2), 'brin_wi_idx'::regclass)
+ where not placeholder;"
);
-is($count, 't', "index got summarized");
+cmp_ok($count, '>', '1', "$count brin_wi_idx ranges got summarized");
+
+$node->poll_query_until(
+ 'postgres',
+ "select count(*) > 1 from brin_page_items(get_raw_page('brin_packdate_idx', 2), 'brin_packdate_idx'::regclass)",
+ 't');
+
+$count = $node->safe_psql(
+ 'postgres',
+ "select count(*) from brin_page_items(get_raw_page('brin_packdate_idx', 2), 'brin_packdate_idx'::regclass)
+ where not placeholder;"
+);
+cmp_ok($count, '>', '1', "$count brin_packdate_idx ranges got summarized");
+
$node->stop;
done_testing();
--
2.47.3
With my initial try of this test, just counting the number of BRIN
tuples, I was _really_ surprised that the index did indeed contain the
expected number of tuples, even when the error was being thrown. This
turned out to be expected, because the way BRIN summarization works is
that we insert a placeholder tuple first, then update it to the correct
value, and the error only aborts the second part.
One thing that's not fully clear to me, but will test later, is that if
this has happened to you, then the placeholder tuple remains in place
and doesn't ever become non-placeholder. If vacuum (incl. autovacuum)
sees such a tuple, it will gladly ignore the page range, as if it were
already summarized. This makes your index scans potentially
inefficient, because for placeholder tuples bringetbitmap will always
include the affected page range.
I think we should make vacuum clean it up somehow, but I'm not yet sure
how safe it is.
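In the meantime, a manual cleanup is possible from SQL -- just a
sketch, assuming pageinspect is installed, that the index's regular
page is block 2 as in the test above, and with the block number in
the last two calls taken from the first query's output:
  -- find ranges whose summary tuple is still a placeholder
  select blknum
    from brin_page_items(get_raw_page('brin_packdate_idx', 2),
                         'brin_packdate_idx'::regclass)
   where placeholder;
  -- drop and rebuild the summary for one such range (1 is an example blknum)
  select brin_desummarize_range('brin_packdate_idx', 1);
  select brin_summarize_range('brin_packdate_idx', 1);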
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte"
(Ijon Tichy en Viajes, Stanislaw Lem)
Hi,
On Tue, Nov 4, 2025 at 5:22 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
With my initial try of this test, just counting the number of BRIN
tuples, I was _really_ surprised that the index did indeed contain the
expected number of tuples, even when the error was being thrown. This
turned out to be expected, because the way BRIN summarization works is
that we insert a placeholder tuple first, then update it to the correct
value, and the error only aborts the second part.
One thing that's not fully clear to me, but will test later, is that if
this has happened to you, then the placeholder tuple remains in place
and doesn't ever become non-placeholder. If vacuum (incl. autovacuum)
sees such a tuple, it will gladly ignore the page range, as if it were
already summarized. This makes your index scans potentially
inefficient, because for placeholder tuples bringetbitmap will always
include the affected page range.
I think we should make vacuum clean it up somehow, but I'm not yet sure
how safe it is.
+1. I noticed the same behavior while working on amcheck support for BRIN.
I think it should be safe, because we hold ShareUpdateExclusiveLock
during any summarization, desummarization, or vacuum, which means we
can't have concurrent summarization during the vacuum. So if we
encounter a placeholder during vacuum, it is always the result of an
"aborted" summarization. At least brinRevmapDesummarizeRange always
treats placeholders this way. Maybe we can use
brinRevmapDesummarizeRange for this? FWIW there is a deadlock issue
with brinRevmapDesummarizeRange [0]; maybe it is worth addressing if
we want to start using it during vacuum.
[0]: /messages/by-id/261e68bc-f5f5-5234-fb2c-af4f583513c0@enterprisedb.com
Best regards,
Arseniy Mukhin
On 2025-Nov-04, Álvaro Herrera wrote:
With my initial try of this test, just counting the number of BRIN
tuples, I was _really_ surprised that the index did indeed contain the
expected number of tuples, even when the error was being thrown. This
turned out to be expected, because the way BRIN summarization works is
that we insert a placeholder tuple first, then update it to the correct
value, and the error only aborts the second part. That's why I needed
to add a WHERE clause to only count non-placeholder tuples.
I see that skink (the buildfarm animal that runs under valgrind) has
failed. Gotta vacate the premises, will study later.
2025-11-04 16:44:46.271 CET [2118443][autovacuum worker][108/6:0] LOG: process 2118443 still waiting for ShareUpdateExclusiveLock on relation 16436 of database 5 after 1016.528 ms
2025-11-04 16:44:46.271 CET [2118443][autovacuum worker][108/6:0] DETAIL: Process holding the lock: 2118078. Wait queue: 2118443.
2025-11-04 16:44:46.271 CET [2118443][autovacuum worker][108/6:0] CONTEXT: waiting for ShareUpdateExclusiveLock on relation 16436 of database 5
2025-11-04 16:44:46.298 CET [2118078][autovacuum worker][103/9:766] ERROR: canceling autovacuum task
2025-11-04 16:44:46.298 CET [2118078][autovacuum worker][103/9:766] CONTEXT: automatic analyze of table "postgres.public.journal"
2025-11-04 16:44:46.382 CET [2118443][autovacuum worker][108/6:0] LOG: process 2118443 acquired ShareUpdateExclusiveLock on relation 16436 of database 5 after 1188.860 ms
2025-11-04 16:44:46.382 CET [2118443][autovacuum worker][108/6:0] CONTEXT: waiting for ShareUpdateExclusiveLock on relation 16436 of database 5
2025-11-04 16:44:46.975 CET [2118946][autovacuum worker][110/7:0] LOG: skipping analyze of "journal" --- lock not available
2025-11-04 16:44:47.490 CET [2118078][autovacuum worker][103/10:0] LOG: process 2118078 still waiting for ShareUpdateExclusiveLock on relation 16436 of database 5 after 1017.402 ms
2025-11-04 16:44:47.490 CET [2118078][autovacuum worker][103/10:0] DETAIL: Process holding the lock: 2118443. Wait queue: 2118078, 2118946.
2025-11-04 16:44:47.490 CET [2118078][autovacuum worker][103/10:0] CONTEXT: waiting for ShareUpdateExclusiveLock on relation 16436 of database 5
2025-11-04 16:44:47.792 CET [2118443][autovacuum worker][108/6:0] ERROR: canceling autovacuum task
2025-11-04 16:44:47.792 CET [2118443][autovacuum worker][108/6:0] CONTEXT: processing work entry for relation "postgres.public.brin_packdate_idx"
2025-11-04 16:44:47.810 CET [2118078][autovacuum worker][103/10:0] LOG: process 2118078 acquired ShareUpdateExclusiveLock on relation 16436 of database 5 after 1414.103 ms
2025-11-04 16:44:47.810 CET [2118078][autovacuum worker][103/10:0] CONTEXT: waiting for ShareUpdateExclusiveLock on relation 16436 of database 5
==2118443== VALGRINDERROR-BEGIN
==2118443== Invalid read of size 8
==2118443== at 0x4634F39: PopActiveSnapshot (snapmgr.c:777)
==2118443== by 0x43F693F: do_autovacuum (autovacuum.c:2561)
==2118443== by 0x43F6E2B: AutoVacWorkerMain (autovacuum.c:1604)
==2118443== by 0x43FA9C9: postmaster_child_launch (launch_backend.c:268)
==2118443== by 0x43FDD9E: StartChildProcess (postmaster.c:3991)
==2118443== by 0x43FE008: StartAutovacuumWorker (postmaster.c:4055)
==2118443== by 0x43FF078: process_pm_pmsignal (postmaster.c:3812)
==2118443== by 0x43FF93C: ServerLoop (postmaster.c:1706)
==2118443== by 0x4401080: PostmasterMain (postmaster.c:1403)
==2118443== by 0x432A55F: main (main.c:231)
==2118443== Address 0x10 is not stack'd, malloc'd or (recently) free'd
==2118443==
==2118443== VALGRINDERROR-END
{
<insert_a_suppression_name_here>
Memcheck:Addr8
fun:PopActiveSnapshot
fun:do_autovacuum
fun:AutoVacWorkerMain
fun:postmaster_child_launch
fun:StartChildProcess
fun:StartAutovacuumWorker
fun:process_pm_pmsignal
fun:ServerLoop
fun:PostmasterMain
fun:main
}
==2118443==
==2118443== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==2118443== Access not within mapped region at address 0x10
==2118443== at 0x4634F39: PopActiveSnapshot (snapmgr.c:777)
==2118443== by 0x43F693F: do_autovacuum (autovacuum.c:2561)
==2118443== by 0x43F6E2B: AutoVacWorkerMain (autovacuum.c:1604)
==2118443== by 0x43FA9C9: postmaster_child_launch (launch_backend.c:268)
==2118443== by 0x43FDD9E: StartChildProcess (postmaster.c:3991)
==2118443== by 0x43FE008: StartAutovacuumWorker (postmaster.c:4055)
==2118443== by 0x43FF078: process_pm_pmsignal (postmaster.c:3812)
==2118443== by 0x43FF93C: ServerLoop (postmaster.c:1706)
==2118443== by 0x4401080: PostmasterMain (postmaster.c:1403)
==2118443== by 0x432A55F: main (main.c:231)
==2118443== If you believe this happened as a result of a stack
==2118443== overflow in your program's main thread (unlikely but
==2118443== possible), you can try to increase the size of the
==2118443== main thread stack using the --main-stacksize= flag.
==2118443== The main thread stack size used in this run was 8388608.
2025-11-04 16:44:48.010 CET [2118946][autovacuum worker][110/8:0] LOG: process 2118946 still waiting for ShareUpdateExclusiveLock on relation 16436 of database 5 after 1013.910 ms
2025-11-04 16:44:48.010 CET [2118946][autovacuum worker][110/8:0] DETAIL: Process holding the lock: 2118078. Wait queue: 2118946.
2025-11-04 16:44:48.010 CET [2118946][autovacuum worker][110/8:0] CONTEXT: waiting for ShareUpdateExclusiveLock on relation 16436 of database 5
2025-11-04 16:44:48.098 CET [2118946][autovacuum worker][110/8:0] LOG: process 2118946 acquired ShareUpdateExclusiveLock on relation 16436 of database 5 after 1111.825 ms
2025-11-04 16:44:48.098 CET [2118946][autovacuum worker][110/8:0] CONTEXT: waiting for ShareUpdateExclusiveLock on relation 16436 of database 5
2025-11-04 16:44:48.482 CET [2119181][client backend][8/2:0] LOG: statement: select count(*) from brin_page_items(get_raw_page('brin_wi_idx', 2), 'brin_wi_idx'::regclass)
where not placeholder;
2025-11-04 16:44:48.799 CET [2109989][postmaster][:0] LOG: autovacuum worker (PID 2118443) was terminated by signal 11: Segmentation fault
2025-11-04 16:44:48.799 CET [2109989][postmaster][:0] DETAIL: Failed process was running: autovacuum: BRIN summarize public.brin_packdate_idx 1
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On 2025-Nov-04, Álvaro Herrera wrote:
2025-11-04 16:44:47.792 CET [2118443][autovacuum worker][108/6:0] ERROR: canceling autovacuum task
2025-11-04 16:44:47.792 CET [2118443][autovacuum worker][108/6:0] CONTEXT: processing work entry for relation "postgres.public.brin_packdate_idx"
2025-11-04 16:44:47.810 CET [2118078][autovacuum worker][103/10:0] LOG: process 2118078 acquired ShareUpdateExclusiveLock on relation 16436 of database 5 after 1414.103 ms
==2118443== VALGRINDERROR-BEGIN
==2118443== Invalid read of size 8
==2118443== at 0x4634F39: PopActiveSnapshot (snapmgr.c:777)
==2118443== by 0x43F693F: do_autovacuum (autovacuum.c:2561)
==2118443== by 0x43F6E2B: AutoVacWorkerMain (autovacuum.c:1604)
Ah, the problem is that we push an active snapshot before calling the
BRIN function, but because of the error, the transaction is aborted
before perform_work_item() returns, so when we try to pop that
snapshot, it doesn't exist anymore. Another way for this to happen
would be an autovacuum worker trying to run an item for a dropped
table (the "goto deleted2" case). Hmm ...
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
On 2025-Nov-04, Álvaro Herrera wrote:
Ah, the problem is that we push an active snapshot before calling the
BRIN function, but because of the error, the transaction is aborted
before perform_work_item() returns, so when we try to pop that
snapshot, it doesn't exist anymore. Another way for this to happen
would be an autovacuum worker trying to run an item for a dropped
table (the "goto deleted2" case).
Of course, the answer is just to check ActiveSnapshotSet() beforehand.
I tested this by adding an elog(ERROR) in perform_work_item() -- without
this fix, I see a segfault, which disappears with it.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Debido a que la velocidad de la luz es mucho mayor que la del sonido,
algunas personas nos parecen brillantes un minuto antes
de escuchar las pelotudeces que dicen." (Roberto Fontanarrosa)
Attachments:
0001-fix-brin-failure.patch (text/x-diff)
From 6558ceccdb96f37d3f1db8ad426ae02d1d29231c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C3=81lvaro=20Herrera?= <alvherre@kurilemu.de>
Date: Tue, 4 Nov 2025 18:32:18 +0100
Subject: [PATCH] fix brin failure
---
src/backend/postmaster/autovacuum.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 59ec45a4e96..ed19c74bb19 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2558,7 +2558,8 @@ deleted:
PushActiveSnapshot(GetTransactionSnapshot());
perform_work_item(workitem);
- PopActiveSnapshot();
+ if (ActiveSnapshotSet()) /* transaction could have aborted */
+ PopActiveSnapshot();
/*
* Check for config changes before acquiring lock for further jobs.
--
2.47.3