Use streaming read I/O in BRIN vacuuming

Started by Arseniy Mukhin8 months ago10 messageshackers
Jump to latest
#1Arseniy Mukhin
arseniy.mukhin.dev@gmail.com

Hi,

B-tree, GiST and SP-GiST take advantage of the read stream API during
vacuuming. BRIN vacuum seems like a perfect candidate for it as well.
During a vacuum it reads the entire relation sequentially, page by
page.

PFA the patch that migrates BRIN vacuum to the read stream API.

Best regards,
Arseniy Mukhin

Attachments:

v1-0001-Use-streaming-read-I-O-in-BRIN-vacuuming.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Use-streaming-read-I-O-in-BRIN-vacuuming.patchDownload+24-10
#2Andrey Borodin
amborodin@acm.org
In reply to: Arseniy Mukhin (#1)
Re: Use streaming read I/O in BRIN vacuuming

Hi!

On 31 Aug 2025, at 21:17, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:

PFA the patch that migrates BRIN vacuum to the read stream API.

The patch is nice and straightforward. Looks good to me.

Some notes that do not seem to me problem of this patch:
1. This comment is copied 7 times already across codebase.
"It is safe to use batchmode as block_range_read_stream_cb"
Maybe we can refactor comments and function names...
2. Somehow brin_vacuum_scan() avoid dance of getting RelationGetNumberOfBlocks() many times to be entirely sure everything is scanned. Unlike other index vacuums, see btvacuumscan() for example.

Best regards, Andrey Borodin.

#3Arseniy Mukhin
arseniy.mukhin.dev@gmail.com
In reply to: Andrey Borodin (#2)
Re: Use streaming read I/O in BRIN vacuuming

Hi!

On Sun, Aug 31, 2025 at 8:49 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Hi!

On 31 Aug 2025, at 21:17, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:

PFA the patch that migrates BRIN vacuum to the read stream API.

The patch is nice and straightforward. Looks good to me.

Thank you for the review!

Some notes that do not seem to me problem of this patch:
1. This comment is copied 7 times already across codebase.
"It is safe to use batchmode as block_range_read_stream_cb"
Maybe we can refactor comments and function names...

Yes, I had similar thoughts, but having these comments at callsites
has its own benefits, there is a thread about these comments [0]/messages/by-id/CAE7r3M+P1Qcv7CYxi=Jw_d40L=sVRAxdDti-Vo4X5x0opZ3XVw@mail.gmail.com...

2. Somehow brin_vacuum_scan() avoid dance of getting RelationGetNumberOfBlocks() many times to be entirely sure everything is scanned. Unlike other index vacuums, see btvacuumscan() for example.

If I understand correctly, in other access methods you need to be sure
that you read the relation up to the end, so you don't leave any index
tuples that should be pruned. BRIN doesn't have a prune phase, there
is only a cleanup phase. So it seems it's not a big deal if you miss
several pages that were allocated during the vacuum.

[0]: /messages/by-id/CAE7r3M+P1Qcv7CYxi=Jw_d40L=sVRAxdDti-Vo4X5x0opZ3XVw@mail.gmail.com

Best regards,
Arseniy Mukhin

#4Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Arseniy Mukhin (#3)
Re: Use streaming read I/O in BRIN vacuuming

On Sun, Aug 31, 2025 at 12:48 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi!

On Sun, Aug 31, 2025 at 8:49 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Hi!

On 31 Aug 2025, at 21:17, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:

PFA the patch that migrates BRIN vacuum to the read stream API.

The patch is nice and straightforward. Looks good to me.

Thank you for the review!

Some notes that do not seem to me problem of this patch:
1. This comment is copied 7 times already across codebase.
"It is safe to use batchmode as block_range_read_stream_cb"
Maybe we can refactor comments and function names...

Yes, I had similar thoughts, but having these comments at callsites
has its own benefits, there is a thread about these comments [0]...

2. Somehow brin_vacuum_scan() avoid dance of getting RelationGetNumberOfBlocks() many times to be entirely sure everything is scanned. Unlike other index vacuums, see btvacuumscan() for example.

If I understand correctly, in other access methods you need to be sure
that you read the relation up to the end, so you don't leave any index
tuples that should be pruned. BRIN doesn't have a prune phase, there
is only a cleanup phase. So it seems it's not a big deal if you miss
several pages that were allocated during the vacuum.

Thank you for proposing the patch! I've reviewed the patch and have
some comments:

+   stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE |
+                                       READ_STREAM_FULL |
+                                       READ_STREAM_SEQUENTIAL |
+                                       READ_STREAM_USE_BATCHING,
+                                       strategy,
+                                       idxrel,
+                                       MAIN_FORKNUM,
+                                       block_range_read_stream_cb,
+                                       &p,
+                                       0);

Unlike other index AM's it uses READ_STREAM_SEQUENTIAL. If there are
some reasons to use it, we should leave comments there.

---
Have you done any performance testing with this patch? Since BRIN
indexes typically don't grow very large, I'm curious how much benefit
the read_stream provides for BRIN index cleanup.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#5Nazir Bilal Yavuz
byavuz81@gmail.com
In reply to: Masahiko Sawada (#4)
Re: Use streaming read I/O in BRIN vacuuming

Hi,

On Thu, 9 Oct 2025 at 02:03, Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Thank you for proposing the patch! I've reviewed the patch and have
some comments:

+   stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE |
+                                       READ_STREAM_FULL |
+                                       READ_STREAM_SEQUENTIAL |
+                                       READ_STREAM_USE_BATCHING,
+                                       strategy,
+                                       idxrel,
+                                       MAIN_FORKNUM,
+                                       block_range_read_stream_cb,
+                                       &p,
+                                       0);

Unlike other index AM's it uses READ_STREAM_SEQUENTIAL. If there are
some reasons to use it, we should leave comments there.

Andres has some comments about retiring the READ_STREAM_SEQUENTIAL
flag as it is automatically detected and can potentially hurt
performance [1]/messages/by-id/go7c2sudqg4pp7dsabsak4ajugti4f3tyqoweja5ihcjiw65dc@yh6wi4zqys23.

[1]: /messages/by-id/go7c2sudqg4pp7dsabsak4ajugti4f3tyqoweja5ihcjiw65dc@yh6wi4zqys23

--
Regards,
Nazir Bilal Yavuz
Microsoft

#6Arseniy Mukhin
arseniy.mukhin.dev@gmail.com
In reply to: Nazir Bilal Yavuz (#5)
Re: Use streaming read I/O in BRIN vacuuming

Hi,

On Thu, Oct 9, 2025 at 2:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sun, Aug 31, 2025 at 12:48 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi!

On Sun, Aug 31, 2025 at 8:49 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Hi!

On 31 Aug 2025, at 21:17, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:

PFA the patch that migrates BRIN vacuum to the read stream API.

The patch is nice and straightforward. Looks good to me.

Thank you for the review!

Some notes that do not seem to me problem of this patch:
1. This comment is copied 7 times already across codebase.
"It is safe to use batchmode as block_range_read_stream_cb"
Maybe we can refactor comments and function names...

Yes, I had similar thoughts, but having these comments at callsites
has its own benefits, there is a thread about these comments [0]...

2. Somehow brin_vacuum_scan() avoid dance of getting RelationGetNumberOfBlocks() many times to be entirely sure everything is scanned. Unlike other index vacuums, see btvacuumscan() for example.

If I understand correctly, in other access methods you need to be sure
that you read the relation up to the end, so you don't leave any index
tuples that should be pruned. BRIN doesn't have a prune phase, there
is only a cleanup phase. So it seems it's not a big deal if you miss
several pages that were allocated during the vacuum.

Thank you for proposing the patch! I've reviewed the patch and have
some comments:

Thank you for the review!

+   stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE |
+                                       READ_STREAM_FULL |
+                                       READ_STREAM_SEQUENTIAL |
+                                       READ_STREAM_USE_BATCHING,
+                                       strategy,
+                                       idxrel,
+                                       MAIN_FORKNUM,
+                                       block_range_read_stream_cb,
+                                       &p,
+                                       0);

Unlike other index AM's it uses READ_STREAM_SEQUENTIAL. If there are
some reasons to use it, we should leave comments there.

Good point, thank you. I looked again at the usage of the
READ_STREAM_SEQUENTIAL flag, and I'm leaning toward not using it here.
But I'm not completely sure, so I decided to ask about the flag usage
in the thread [0]/messages/by-id/CAE7r3MJ5NHb1BMo1oCNWmfrG=Ytx-GKX44YNdA21FuQQQeq_Qg@mail.gmail.com.

---
Have you done any performance testing with this patch? Since BRIN
indexes typically don't grow very large, I'm curious how much benefit
the read_stream provides for BRIN index cleanup.

Yeah, BRIN is known as a compact index, so these changes should have
less impact, but I thought it should still be useful, just on a
smaller scale. I didn't run any measurements, as the changes are
similar to those made to other index AMs, where this was considered an
improvement. But after your question, I decided to try running some
benchmarks.

------------------------------------------
Performance testing
------------------------------------------

Setup & data (fill_data.sql)

shared_buffers = 1Gb
Table size: ~970Mb
Index size: ~30Mb (BRIN index artificially was made quite big, but it
lets us see patch impact better).

Test cases

1) Cold data - restart db, drop page cache, table seqscan, run vacuum.
This way we read index data from the disk (restart_with_cache_drop.sh)
2) Page cache - restart db, table seqscan, run vacuum. This way we
read index data from the page cache (restart_with_wo_cache_drop.sh)
3) Shared buffers - run vacuum with data in shared buffers (use query
with bitmap scan to warm cache) (vacuum_cached.sh)

Every io_method was used with every test case. I also tried to remove
READ_STREAM_SEQUENTIAL and measured it too (it's called
patched_wo_seq).

What were measured:

1) Duration of brin_vacuum_scan (without FreeSpaceMapVacuum(idxrel))
2) Avg read rate from vacuum report. Index has 4260 pages, and the
vacuum report says 4300 pages were read, so it looks like almost all
read rate is from brin_vacuum_scan().

Results

--- Cold data ---
Method     | Version         |   Duration (ms) |    Throughput (MB/s)
----------------------------------------------------------------------
io_uring   | master             |      14.6 ± 0.8 |       2068.1 ± 127.6
              | patched              |       9.9 ± 1.3 |       2947.3 ± 363.0
             | patched_wo_seq  |      10.5 ± 1.5 |    2817.5 ± 359.7
sync       | master                |      16.1 ± 1.9 |    1921.6 ± 211.6
              | patched               |      15.4 ± 1.6 |    1985.9 ± 182.3
              | patched_wo_seq  |      13.9 ± 1.2 |    2155.7 ± 180.4
worker     | master               |      16.4 ± 1.6 |     1850.0 ± 168.6
              | patched               |       9.2 ± 0.5 |       3077.3 ± 154.3
              | patched_wo_seq  |       9.0 ± 0.5 |     3133.3 ± 158.1
--- Page cache ---

Method | Version | Duration (ms) | Throughput (MB/s)
----------------------------------------------------------------------
io_uring | master | 10.4 ± 1.7 | 2971.3 ± 469.6
| patched | 7.3 ± 1.7 | 4152.7 ± 910.6
| patched_wo_seq | 7.0 ± 1.4 | 4208.0 ± 742.2
sync | master | 10.4 ± 2.3 | 3037.9 ± 655.6
| patched | 9.2 ± 1.5 | 3319.8 ± 646.7
| patched_wo_seq | 9.1 ± 1.6 | 3384.7 ± 674.9
worker | master | 10.2 ± 1.7 | 3013.4 ± 490.0
| patched | 3.0 ± 0.2 | 7878.6 ± 540.3
| patched_wo_seq | 3.0 ± 0.3 | 8127.3 ± 1093.7

--- Shared buffers ---

Method | Version | Duration (ms) | Throughput (MB/s)
----------------------------------------------------------------------
io_uring | master | 3.1 ± 0.3 | 0.0 ± 0.0
| patched | 3.2 ± 0.5 | 0.0 ± 0.0
| patched_wo_seq | 3.3 ± 0.4 | 0.0 ± 0.0
sync | master | 3.1 ± 0.4 | 0.0 ± 0.0
| patched | 3.3 ± 0.5 | 0.0 ± 0.0
| patched_wo_seq | 3.1 ± 0.5 | 0.0 ± 0.0
worker | master | 3.2 ± 0.4 | 0.0 ± 0.0
| patched | 3.4 ± 0.3 | 0.0 ± 0.0
| patched_wo_seq | 3.6 ± 0.3 | 0.0 ± 0.0

Looks like we have a win for 'cold data' and 'page cache' cases with
worker and io_uring, and some modest improvement for sync.
READ_STREAM_SEQUENTIAL doesn't seem to change anything. I don't have
much benchmarking experience, so hope I didn't make any bad
mistakes.I'm new to benchmarking, so I hope I haven't made any serious
mistakes.

Thank you!

[0]: /messages/by-id/CAE7r3MJ5NHb1BMo1oCNWmfrG=Ytx-GKX44YNdA21FuQQQeq_Qg@mail.gmail.com

Best regards,
Arseniy Mukhin

Attachments:

restart_with_wo_cache_drop.shapplication/x-shellscript; name=restart_with_wo_cache_drop.shDownload
restart_with_cache_drop.shapplication/x-shellscript; name=restart_with_cache_drop.shDownload
vacuum_cached.shapplication/x-shellscript; name=vacuum_cached.shDownload
fill_data.sqlapplication/sql; name=fill_data.sqlDownload
#7Arseniy Mukhin
arseniy.mukhin.dev@gmail.com
In reply to: Arseniy Mukhin (#6)
Re: Use streaming read I/O in BRIN vacuuming

Hi,

On Mon, Oct 13, 2025 at 5:54 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi,

On Thu, Oct 9, 2025 at 2:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sun, Aug 31, 2025 at 12:48 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi!

On Sun, Aug 31, 2025 at 8:49 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Hi!

On 31 Aug 2025, at 21:17, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:

PFA the patch that migrates BRIN vacuum to the read stream API.

The patch is nice and straightforward. Looks good to me.

Thank you for the review!

Some notes that do not seem to me problem of this patch:
1. This comment is copied 7 times already across codebase.
"It is safe to use batchmode as block_range_read_stream_cb"
Maybe we can refactor comments and function names...

Yes, I had similar thoughts, but having these comments at callsites
has its own benefits, there is a thread about these comments [0]...

2. Somehow brin_vacuum_scan() avoid dance of getting RelationGetNumberOfBlocks() many times to be entirely sure everything is scanned. Unlike other index vacuums, see btvacuumscan() for example.

If I understand correctly, in other access methods you need to be sure
that you read the relation up to the end, so you don't leave any index
tuples that should be pruned. BRIN doesn't have a prune phase, there
is only a cleanup phase. So it seems it's not a big deal if you miss
several pages that were allocated during the vacuum.

Thank you for proposing the patch! I've reviewed the patch and have
some comments:

Thank you for the review!

+   stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE |
+                                       READ_STREAM_FULL |
+                                       READ_STREAM_SEQUENTIAL |
+                                       READ_STREAM_USE_BATCHING,
+                                       strategy,
+                                       idxrel,
+                                       MAIN_FORKNUM,
+                                       block_range_read_stream_cb,
+                                       &p,
+                                       0);

Unlike other index AM's it uses READ_STREAM_SEQUENTIAL. If there are
some reasons to use it, we should leave comments there.

Good point, thank you. I looked again at the usage of the
READ_STREAM_SEQUENTIAL flag, and I'm leaning toward not using it here.
But I'm not completely sure, so I decided to ask about the flag usage
in the thread [0].

I removed READ_STREAM_SEQUENTIAL. The comment around
READ_STREAM_SEQUENTIAL says it should be used for cases where
sequential access might not be correctly detected. We use
block_range_read_stream_cb here, so the pattern should be clear. PFA
the new version.

Best regards,
Arseniy Mukhin

Attachments:

v2-0001-Use-streaming-read-I-O-in-BRIN-vacuuming.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Use-streaming-read-I-O-in-BRIN-vacuuming.patchDownload+23-10
#8Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Arseniy Mukhin (#7)
Re: Use streaming read I/O in BRIN vacuuming

On Wed, Nov 12, 2025 at 11:57 AM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi,

On Mon, Oct 13, 2025 at 5:54 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi,

On Thu, Oct 9, 2025 at 2:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sun, Aug 31, 2025 at 12:48 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi!

On Sun, Aug 31, 2025 at 8:49 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Hi!

On 31 Aug 2025, at 21:17, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:

PFA the patch that migrates BRIN vacuum to the read stream API.

The patch is nice and straightforward. Looks good to me.

Thank you for the review!

Some notes that do not seem to me problem of this patch:
1. This comment is copied 7 times already across codebase.
"It is safe to use batchmode as block_range_read_stream_cb"
Maybe we can refactor comments and function names...

Yes, I had similar thoughts, but having these comments at callsites
has its own benefits, there is a thread about these comments [0]...

2. Somehow brin_vacuum_scan() avoid dance of getting RelationGetNumberOfBlocks() many times to be entirely sure everything is scanned. Unlike other index vacuums, see btvacuumscan() for example.

If I understand correctly, in other access methods you need to be sure
that you read the relation up to the end, so you don't leave any index
tuples that should be pruned. BRIN doesn't have a prune phase, there
is only a cleanup phase. So it seems it's not a big deal if you miss
several pages that were allocated during the vacuum.

Thank you for proposing the patch! I've reviewed the patch and have
some comments:

Thank you for the review!

+   stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE |
+                                       READ_STREAM_FULL |
+                                       READ_STREAM_SEQUENTIAL |
+                                       READ_STREAM_USE_BATCHING,
+                                       strategy,
+                                       idxrel,
+                                       MAIN_FORKNUM,
+                                       block_range_read_stream_cb,
+                                       &p,
+                                       0);

Unlike other index AM's it uses READ_STREAM_SEQUENTIAL. If there are
some reasons to use it, we should leave comments there.

Good point, thank you. I looked again at the usage of the
READ_STREAM_SEQUENTIAL flag, and I'm leaning toward not using it here.
But I'm not completely sure, so I decided to ask about the flag usage
in the thread [0].

I removed READ_STREAM_SEQUENTIAL. The comment around
READ_STREAM_SEQUENTIAL says it should be used for cases where
sequential access might not be correctly detected. We use
block_range_read_stream_cb here, so the pattern should be clear. PFA
the new version.

Thank you for updating the patch and sharing the performance test
results! The patch looks good to me. I'm going to push it, barring any
objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#9Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#8)
Re: Use streaming read I/O in BRIN vacuuming

On Thu, Nov 13, 2025 at 4:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Nov 12, 2025 at 11:57 AM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi,

On Mon, Oct 13, 2025 at 5:54 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi,

On Thu, Oct 9, 2025 at 2:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sun, Aug 31, 2025 at 12:48 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi!

On Sun, Aug 31, 2025 at 8:49 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Hi!

On 31 Aug 2025, at 21:17, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:

PFA the patch that migrates BRIN vacuum to the read stream API.

The patch is nice and straightforward. Looks good to me.

Thank you for the review!

Some notes that do not seem to me problem of this patch:
1. This comment is copied 7 times already across codebase.
"It is safe to use batchmode as block_range_read_stream_cb"
Maybe we can refactor comments and function names...

Yes, I had similar thoughts, but having these comments at callsites
has its own benefits, there is a thread about these comments [0]...

2. Somehow brin_vacuum_scan() avoid dance of getting RelationGetNumberOfBlocks() many times to be entirely sure everything is scanned. Unlike other index vacuums, see btvacuumscan() for example.

If I understand correctly, in other access methods you need to be sure
that you read the relation up to the end, so you don't leave any index
tuples that should be pruned. BRIN doesn't have a prune phase, there
is only a cleanup phase. So it seems it's not a big deal if you miss
several pages that were allocated during the vacuum.

Thank you for proposing the patch! I've reviewed the patch and have
some comments:

Thank you for the review!

+   stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE |
+                                       READ_STREAM_FULL |
+                                       READ_STREAM_SEQUENTIAL |
+                                       READ_STREAM_USE_BATCHING,
+                                       strategy,
+                                       idxrel,
+                                       MAIN_FORKNUM,
+                                       block_range_read_stream_cb,
+                                       &p,
+                                       0);

Unlike other index AM's it uses READ_STREAM_SEQUENTIAL. If there are
some reasons to use it, we should leave comments there.

Good point, thank you. I looked again at the usage of the
READ_STREAM_SEQUENTIAL flag, and I'm leaning toward not using it here.
But I'm not completely sure, so I decided to ask about the flag usage
in the thread [0].

I removed READ_STREAM_SEQUENTIAL. The comment around
READ_STREAM_SEQUENTIAL says it should be used for cases where
sequential access might not be correctly detected. We use
block_range_read_stream_cb here, so the pattern should be clear. PFA
the new version.

Thank you for updating the patch and sharing the performance test
results! The patch looks good to me. I'm going to push it, barring any
objections.

Pushed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#10Arseniy Mukhin
arseniy.mukhin.dev@gmail.com
In reply to: Masahiko Sawada (#9)
Re: Use streaming read I/O in BRIN vacuuming

On Tue, Nov 18, 2025 at 3:20 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Nov 13, 2025 at 4:47 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Nov 12, 2025 at 11:57 AM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi,

On Mon, Oct 13, 2025 at 5:54 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi,

On Thu, Oct 9, 2025 at 2:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sun, Aug 31, 2025 at 12:48 PM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

Hi!

On Sun, Aug 31, 2025 at 8:49 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

Hi!

On 31 Aug 2025, at 21:17, Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> wrote:

PFA the patch that migrates BRIN vacuum to the read stream API.

The patch is nice and straightforward. Looks good to me.

Thank you for the review!

Some notes that do not seem to me problem of this patch:
1. This comment is copied 7 times already across codebase.
"It is safe to use batchmode as block_range_read_stream_cb"
Maybe we can refactor comments and function names...

Yes, I had similar thoughts, but having these comments at callsites
has its own benefits, there is a thread about these comments [0]...

2. Somehow brin_vacuum_scan() avoid dance of getting RelationGetNumberOfBlocks() many times to be entirely sure everything is scanned. Unlike other index vacuums, see btvacuumscan() for example.

If I understand correctly, in other access methods you need to be sure
that you read the relation up to the end, so you don't leave any index
tuples that should be pruned. BRIN doesn't have a prune phase, there
is only a cleanup phase. So it seems it's not a big deal if you miss
several pages that were allocated during the vacuum.

Thank you for proposing the patch! I've reviewed the patch and have
some comments:

Thank you for the review!

+   stream = read_stream_begin_relation(READ_STREAM_MAINTENANCE |
+                                       READ_STREAM_FULL |
+                                       READ_STREAM_SEQUENTIAL |
+                                       READ_STREAM_USE_BATCHING,
+                                       strategy,
+                                       idxrel,
+                                       MAIN_FORKNUM,
+                                       block_range_read_stream_cb,
+                                       &p,
+                                       0);

Unlike other index AM's it uses READ_STREAM_SEQUENTIAL. If there are
some reasons to use it, we should leave comments there.

Good point, thank you. I looked again at the usage of the
READ_STREAM_SEQUENTIAL flag, and I'm leaning toward not using it here.
But I'm not completely sure, so I decided to ask about the flag usage
in the thread [0].

I removed READ_STREAM_SEQUENTIAL. The comment around
READ_STREAM_SEQUENTIAL says it should be used for cases where
sequential access might not be correctly detected. We use
block_range_read_stream_cb here, so the pattern should be clear. PFA
the new version.

Thank you for updating the patch and sharing the performance test
results! The patch looks good to me. I'm going to push it, barring any
objections.

Pushed.

Thank you for reviewing and pushing!

Best regards,
Arseniy Mukhin