Vectorize pg_visibility.pg_visibility_map_summary

Started by Matthias van de Meent, 20 days ago, 5 messages
#1 Matthias van de Meent
boekewurm+postgres@gmail.com
1 attachment(s)

Hi,

Whilst working on fixing a bug in GiST and SP-GiST's index-only scan
systems, I noticed that pg_visibility is sometimes rather wasteful
with the APIs which it calls into; especially now that there are more
optimized APIs available.

Here's one small patch that makes it use the visibilitymap_count() API
for pg_visibility_map_summary(), replacing its own bespoke counting
mechanism with the primary implementation that has vectorized
optimizations, thus reducing the overhead of
pg_visibility_map_summary.
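
To illustrate why the word-at-a-time approach is cheaper, here is a minimal standalone sketch, not the PostgreSQL implementation. It assumes the map is a plain array of 64-bit words holding two status bits per heap block (low bit all-visible, high bit all-frozen); the names count_per_block, count_per_word and popcount64 are hypothetical, and the portable popcount stands in for whatever builtin or SIMD routine a real implementation would likely use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two status bits per heap block (assumed layout for this sketch). */
#define ALL_VISIBLE 0x01
#define ALL_FROZEN  0x02

/* Masks that select one flag for all 32 blocks packed in a 64-bit word. */
#define VISIBLE_MASK64 UINT64_C(0x5555555555555555)
#define FROZEN_MASK64  UINT64_C(0xaaaaaaaaaaaaaaaa)

/* Portable popcount: clears the lowest set bit until none remain. */
static unsigned
popcount64(uint64_t w)
{
	unsigned	n = 0;

	while (w != 0)
	{
		w &= w - 1;
		n++;
	}
	return n;
}

/* Bespoke style: look up and test the two bits of every block in turn. */
static void
count_per_block(const uint64_t *map, size_t nwords,
				uint64_t *all_visible, uint64_t *all_frozen)
{
	assert(map != NULL || nwords == 0);
	*all_visible = 0;
	*all_frozen = 0;

	for (size_t blk = 0; blk < nwords * 32; blk++)
	{
		unsigned	bits = (unsigned) (map[blk / 32] >> ((blk % 32) * 2)) & 0x03;

		if ((bits & ALL_VISIBLE) != 0)
			(*all_visible)++;
		if ((bits & ALL_FROZEN) != 0)
			(*all_frozen)++;
	}
}

/* Word style: mask out one flag for 32 blocks at once and count set bits. */
static void
count_per_word(const uint64_t *map, size_t nwords,
			   uint64_t *all_visible, uint64_t *all_frozen)
{
	assert(map != NULL || nwords == 0);
	*all_visible = 0;
	*all_frozen = 0;

	for (size_t i = 0; i < nwords; i++)
	{
		*all_visible += popcount64(map[i] & VISIBLE_MASK64);
		*all_frozen += popcount64(map[i] & FROZEN_MASK64);
	}
}
```

Both functions return the same totals for the same input; the second one does a fixed amount of work per 32 blocks rather than a lookup and two branches per block, which is the kind of saving the patch picks up by delegating to visibilitymap_count().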

I've CC-ed the authors of 41c51f0c68: that commit optimized
visibilitymap_count(), but this potential caller was never updated to
use it.

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

Attachments:

v1-0001-pg_visibility-Use-visibilitymap_count-instead-of-.patch (application/octet-stream)
From 761ceb93ac3c339c02e2337c43ddd8edbf79d692 Mon Sep 17 00:00:00 2001
From: Matthias van de Meent <boekewurm+postgres@gmail.com>
Date: Fri, 19 Dec 2025 22:12:39 +0100
Subject: [PATCH v1] pg_visibility: Use visibilitymap_count instead of loop

This improves performance by a good margin by vectorizing
the counting operations.
---
 contrib/pg_visibility/pg_visibility.c | 31 +++++----------------------
 1 file changed, 5 insertions(+), 26 deletions(-)

diff --git a/contrib/pg_visibility/pg_visibility.c b/contrib/pg_visibility/pg_visibility.c
index 7046c1b5f8e..715f5cdd17c 100644
--- a/contrib/pg_visibility/pg_visibility.c
+++ b/contrib/pg_visibility/pg_visibility.c
@@ -270,11 +270,8 @@ pg_visibility_map_summary(PG_FUNCTION_ARGS)
 {
 	Oid			relid = PG_GETARG_OID(0);
 	Relation	rel;
-	BlockNumber nblocks;
-	BlockNumber blkno;
-	Buffer		vmbuffer = InvalidBuffer;
-	int64		all_visible = 0;
-	int64		all_frozen = 0;
+	BlockNumber all_visible = 0;
+	BlockNumber all_frozen = 0;
 	TupleDesc	tupdesc;
 	Datum		values[2];
 	bool		nulls[2] = {0};
@@ -284,33 +281,15 @@ pg_visibility_map_summary(PG_FUNCTION_ARGS)
 	/* Only some relkinds have a visibility map */
 	check_relation_relkind(rel);
 
-	nblocks = RelationGetNumberOfBlocks(rel);
-
-	for (blkno = 0; blkno < nblocks; ++blkno)
-	{
-		int32		mapbits;
-
-		/* Make sure we are interruptible. */
-		CHECK_FOR_INTERRUPTS();
-
-		/* Get map info. */
-		mapbits = (int32) visibilitymap_get_status(rel, blkno, &vmbuffer);
-		if ((mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0)
-			++all_visible;
-		if ((mapbits & VISIBILITYMAP_ALL_FROZEN) != 0)
-			++all_frozen;
-	}
+	visibilitymap_count(rel, &all_visible, &all_frozen);
 
-	/* Clean up. */
-	if (vmbuffer != InvalidBuffer)
-		ReleaseBuffer(vmbuffer);
 	relation_close(rel, AccessShareLock);
 
 	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
 		elog(ERROR, "return type must be a row type");
 
-	values[0] = Int64GetDatum(all_visible);
-	values[1] = Int64GetDatum(all_frozen);
+	values[0] = Int64GetDatum((int64) all_visible);
+	values[1] = Int64GetDatum((int64) all_frozen);
 
 	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
 }
-- 
2.50.1 (Apple Git-155)

#2 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Matthias van de Meent (#1)
Re: Vectorize pg_visibility.pg_visibility_map_summary

Hi,

On Mon, Dec 22, 2025 at 1:28 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

> Hi,
>
> Whilst working on fixing a bug in GiST and SP-GiST's index-only scan
> systems, I noticed that pg_visibility is sometimes rather wasteful
> with the APIs which it calls into; especially now that there are more
> optimized APIs available.
>
> Here's one small patch that makes it use the visibilitymap_count() API
> for pg_visibility_map_summary(), replacing its own bespoke counting
> mechanism with the primary implementation that has vectorized
> optimizations, thus reducing the overhead of
> pg_visibility_map_summary.

It looks like a reasonable idea as it also simplifies the
pg_visibility_map_summary() function. I'm going to push it, barring
any objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#3 Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Masahiko Sawada (#2)
Re: Vectorize pg_visibility.pg_visibility_map_summary

Hi,

On Mon, 22 Dec 2025 at 23:04, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

> On Mon, Dec 22, 2025 at 1:28 PM Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
>
>> Here's one small patch that makes it use the visibilitymap_count() API
>> for pg_visibility_map_summary(), replacing its own bespoke counting
>> mechanism with the primary implementation that has vectorized
>> optimizations, thus reducing the overhead of
>> pg_visibility_map_summary.
>
> It looks like a reasonable idea as it also simplifies the
> pg_visibility_map_summary() function. I'm going to push it, barring
> any objections.

Obviously no objections from me, and, thanks!

Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)

#4 wenhui qiu
qiuwenhuifx@gmail.com
In reply to: Matthias van de Meent (#3)
Re: Vectorize pg_visibility.pg_visibility_map_summary

Hi

> It looks like a reasonable idea as it also simplifies the
> pg_visibility_map_summary() function. I'm going to push it, barring
> any objections.

Obviously no objections. Using visibilitymap_count() simplifies the code
and improves performance, with no behavior change.

Thanks

#5 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: wenhui qiu (#4)
Re: Vectorize pg_visibility.pg_visibility_map_summary

On Mon, Dec 22, 2025 at 8:03 PM wenhui qiu <qiuwenhuifx@gmail.com> wrote:

> Hi
>
>> It looks like a reasonable idea as it also simplifies the
>> pg_visibility_map_summary() function. I'm going to push it, barring
>> any objections.
>
> Obviously no objections, Using visibilitymap_count() simplifies the code and improves performance, with no behavior change.
>
> Thanks

Pushed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com