log XLogPrefetch stats at end of recovery

Started by Lakshmi N2 months ago11 messageshackers

lakshmin.jhs@gmail.com

2 months ago

Hi,

While investigating a long recovery, I noticed that XLogPrefetch stats were
not logged at the end of recovery. This log message will be useful to
understand how effective XLogPrefetch was during recovery. Adding a patch
to address this.

Regards,
Lakshmi

SATYANARAYANA NARLAPURAM

satyanarlapuram@gmail.com

2 months ago

In reply to: Lakshmi N (#1)

Re: log XLogPrefetch stats at end of recovery

Hi,

On Wed, Mar 18, 2026 at 12:18 AM Lakshmi N <lakshmin.jhs@gmail.com> wrote:

Hi,

While investigating a long recovery, I noticed that XLogPrefetch stats
were not logged at the end of recovery. This log message will be useful to
understand how effective XLogPrefetch was during recovery. Adding a patch
to address this.

Applied this patch and validated the log message. This log message appears
to be useful to me, particularly while doing fleet wide analysis. I am
wondering if we can periodically log this in standby mode as well, not just
before promoting?

2026-03-20 23:33:13.756 PDT [2265441] LOG: XLogPrefetcher stats:
prefetch=14, hit=6, skip_init=5, skip_new=28, skip_fpw=18, skip_rep=996

Thanks,
Satya

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

2 months ago

In reply to: SATYANARAYANA NARLAPURAM (#2)

Re: log XLogPrefetch stats at end of recovery

Hi,

On Sat, Mar 21, 2026 at 1:16 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

While investigating a long recovery, I noticed that XLogPrefetch stats were not logged at the end of recovery. This log message will be useful to understand how effective XLogPrefetch was during recovery. Adding a patch to address this.

Applied this patch and validated the log message. This log message appears to be useful to me, particularly while doing fleet wide analysis.

2026-03-20 23:33:13.756 PDT [2265441] LOG: XLogPrefetcher stats: prefetch=14, hit=6, skip_init=5, skip_new=28, skip_fpw=18, skip_rep=996

This looks useful to understand how the prefetch helped during long recoveries.

I am wondering if we can periodically log this in standby mode as well, not just before promoting?

Timer-based startup progress messaging allows logging such things
(ereport_startup_progress API). There was an attempt to enable "redo
in progress" for standbys, but that seemed to flood the standby logs
even at the default progress interval of 10 sec.

Having said that, the prefetcher stats could be added to the existing
ereport_startup_progress("redo in progress xxx") message that works
for crash recoveries—however, I don't prefer doing a bunch of atomic
reads every progress interval of 10 sec. Therefore, logging at the end
of recovery looks good to me.

I reviewed the patch. I have the following comment:

+ elog(LOG, "XLogPrefetcher stats: prefetch=%lu, hit=%lu,
skip_init=%lu, skip_new=%lu, skip_fpw=%lu, skip_rep=%lu",

XLogPrefetcher is an internal data structure name, how about "redo
prefetch stats: xxxx" to be consistent with other redo log messages?

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

Jakub Wartak

jakub.wartak@enterprisedb.com

2 months ago

In reply to: Bharath Rupireddy (#3)

Re: log XLogPrefetch stats at end of recovery

On Sun, Mar 22, 2026 at 1:43 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Hi,

On Sat, Mar 21, 2026 at 1:16 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

While investigating a long recovery, I noticed that XLogPrefetch stats were not logged at the end of recovery. This log message will be useful to understand how effective XLogPrefetch was during recovery. Adding a patch to address this.

Applied this patch and validated the log message. This log message appears to be useful to me, particularly while doing fleet wide analysis.

2026-03-20 23:33:13.756 PDT [2265441] LOG: XLogPrefetcher stats: prefetch=14, hit=6, skip_init=5, skip_new=28, skip_fpw=18, skip_rep=996

This looks useful to understand how the prefetch helped during long recoveries.

I am wondering if we can periodically log this in standby mode as well, not just before promoting?

Timer-based startup progress messaging allows logging such things
(ereport_startup_progress API). There was an attempt to enable "redo
in progress" for standbys, but that seemed to flood the standby logs
even at the default progress interval of 10 sec.

Having said that, the prefetcher stats could be added to the existing
ereport_startup_progress("redo in progress xxx") message that works
for crash recoveries—however, I don't prefer doing a bunch of atomic
reads every progress interval of 10 sec.

Therefore, logging at the end of recovery looks good to me.

+1 from me too to only of logging at the end of recovery (so -1 to logging
every now and then). If someone is interested in current state (or progress
over time) I think he can query pg_stat_recovery_prefetch view already, even
today, right?

I reviewed the patch. I have the following comment:

+ elog(LOG, "XLogPrefetcher stats: prefetch=%lu, hit=%lu,
skip_init=%lu, skip_new=%lu, skip_fpw=%lu, skip_rep=%lu",

XLogPrefetcher is an internal data structure name, how about "redo
prefetch stats: xxxx" to be consistent with other redo log messages?

-J.

Lakshmi N

lakshmin.jhs@gmail.com

2 months ago

In reply to: Jakub Wartak (#4)

Re: log XLogPrefetch stats at end of recovery

Hi all,

Thank you for your feedback. Please see the attached patch.

On Mon, Mar 23, 2026 at 4:30 AM Jakub Wartak <jakub.wartak@enterprisedb.com>
wrote:

On Sun, Mar 22, 2026 at 1:43 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Hi,

On Sat, Mar 21, 2026 at 1:16 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

While investigating a long recovery, I noticed that XLogPrefetch

stats were not logged at the end of recovery. This log message will be
useful to understand how effective XLogPrefetch was during recovery. Adding
a patch to address this.

Applied this patch and validated the log message. This log message

appears to be useful to me, particularly while doing fleet wide analysis.

2026-03-20 23:33:13.756 PDT [2265441] LOG: XLogPrefetcher stats:

prefetch=14, hit=6, skip_init=5, skip_new=28, skip_fpw=18, skip_rep=996

This looks useful to understand how the prefetch helped during long

recoveries.

I am wondering if we can periodically log this in standby mode as

well, not just before promoting?

Timer-based startup progress messaging allows logging such things
(ereport_startup_progress API). There was an attempt to enable "redo
in progress" for standbys, but that seemed to flood the standby logs
even at the default progress interval of 10 sec.

Having said that, the prefetcher stats could be added to the existing
ereport_startup_progress("redo in progress xxx") message that works
for crash recoveries—however, I don't prefer doing a bunch of atomic
reads every progress interval of 10 sec.

Therefore, logging at the end of recovery looks good to me.

+1 from me too to only of logging at the end of recovery (so -1 to logging
every now and then). If someone is interested in current state (or progress
over time) I think he can query pg_stat_recovery_prefetch view already,
even
today, right?

I reviewed the patch. I have the following comment:

+ elog(LOG, "XLogPrefetcher stats: prefetch=%lu, hit=%lu,
skip_init=%lu, skip_new=%lu, skip_fpw=%lu, skip_rep=%lu",

XLogPrefetcher is an internal data structure name, how about "redo
prefetch stats: xxxx" to be consistent with other redo log messages?

+1

please find the attached patch addressing this.

Regards,
Lakshmi

Jakub Wartak

jakub.wartak@enterprisedb.com

2 months ago

In reply to: Lakshmi N (#5)

Re: log XLogPrefetch stats at end of recovery

On Tue, Mar 24, 2026 at 9:28 AM Lakshmi N <lakshmin.jhs@gmail.com> wrote:

Hi all,

Thank you for your feedback. Please see the attached patch.

On Mon, Mar 23, 2026 at 4:30 AM Jakub Wartak <jakub.wartak@enterprisedb.com> wrote:

On Sun, Mar 22, 2026 at 1:43 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Hi,

On Sat, Mar 21, 2026 at 1:16 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

While investigating a long recovery, I noticed that XLogPrefetch stats were not logged at the end of recovery. This log message will be useful to understand how effective XLogPrefetch was during recovery. Adding a patch to address this.

Applied this patch and validated the log message. This log message appears to be useful to me, particularly while doing fleet wide analysis.

2026-03-20 23:33:13.756 PDT [2265441] LOG: XLogPrefetcher stats: prefetch=14, hit=6, skip_init=5, skip_new=28, skip_fpw=18, skip_rep=996

This looks useful to understand how the prefetch helped during long recoveries.

I am wondering if we can periodically log this in standby mode as well, not just before promoting?

Timer-based startup progress messaging allows logging such things
(ereport_startup_progress API). There was an attempt to enable "redo
in progress" for standbys, but that seemed to flood the standby logs
even at the default progress interval of 10 sec.

Having said that, the prefetcher stats could be added to the existing
ereport_startup_progress("redo in progress xxx") message that works
for crash recoveries—however, I don't prefer doing a bunch of atomic
reads every progress interval of 10 sec.

Therefore, logging at the end of recovery looks good to me.

+1 from me too to only of logging at the end of recovery (so -1 to logging
every now and then). If someone is interested in current state (or progress
over time) I think he can query pg_stat_recovery_prefetch view already, even
today, right?

I reviewed the patch. I have the following comment:

+ elog(LOG, "XLogPrefetcher stats: prefetch=%lu, hit=%lu,
skip_init=%lu, skip_new=%lu, skip_fpw=%lu, skip_rep=%lu",

XLogPrefetcher is an internal data structure name, how about "redo
prefetch stats: xxxx" to be consistent with other redo log messages?

+1

please find the attached patch addressing this.

Hi Lakshmi,

The pg_stat_get_recovery_prefetch view has 3 additional columns, but they are
showing current situation, maybe this $patch could be enhanced so that it also
calculates averages for some of them? I mean perhaps not just hit ratio would be
nice, but also the e.g. __average__ wal distance or average look ahead
(block distance).
I'm not sure maybe please also wait for input from others if they find
it a good idea
or not.

Also if we are adding such pretty cryptic fields as "skip_fpw" to log,
we would need some
place in documentaiton to explain their meaining probably.
monitoring.sgml already
explains them, so maybe just link to that. I mean if you look what we
have today (below),
they are little less crypting and have different format, not
"skip_fpw=123" but more
like "17 removed" so maybe "17 skipped due to FPW". Maybe

redo done at 0/75C7B290 system usage: CPU: user: 1.02 s, system: 0.51
s, elapsed: 1.53 s
checkpoint complete: end-of-recovery fast wait: wrote 16289 buffers
(99.4%), wrote 3 SLRU buffers; 0 WAL file(s) added, 17 removed, 0
recycled; write=0.049 s, sync=0.191 s, total=0.264 s; sync files=10,
longest=0.187 s, average=0.020 s; distance=287186 kB, estimate=287186
kB; lsn=0/75C7B2C8, redo lsn=0/75C7B2C8

so instead of like:
redo prefetch stats: prefetch=%lu, hit=%lu, skip_init=%lu,
skip_new=%lu, skip_fpw=%lu, skip_rep=%lu"

something like below ones:
redo prefetch stats: done %lu prefetches, %lu hit, %lu zero-initated, ..
redo prefetch stats: done %lu prefetches, (%d% hit ratio), %lu
zero-initated, .. or something like that

-J.

Lakshmi N

lakshmin.jhs@gmail.com

2 months ago

In reply to: Jakub Wartak (#6)

Re: log XLogPrefetch stats at end of recovery

Hi,

On Tue, Mar 24, 2026 at 1:53 AM Jakub Wartak <jakub.wartak@enterprisedb.com>
wrote:

On Tue, Mar 24, 2026 at 9:28 AM Lakshmi N <lakshmin.jhs@gmail.com> wrote:

Hi all,

Thank you for your feedback. Please see the attached patch.

On Mon, Mar 23, 2026 at 4:30 AM Jakub Wartak <

jakub.wartak@enterprisedb.com> wrote:

On Sun, Mar 22, 2026 at 1:43 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Hi,

On Sat, Mar 21, 2026 at 1:16 AM SATYANARAYANA NARLAPURAM
<satyanarlapuram@gmail.com> wrote:

While investigating a long recovery, I noticed that XLogPrefetch

stats were not logged at the end of recovery. This log message will be
useful to understand how effective XLogPrefetch was during recovery. Adding
a patch to address this.

Applied this patch and validated the log message. This log message

appears to be useful to me, particularly while doing fleet wide analysis.

2026-03-20 23:33:13.756 PDT [2265441] LOG: XLogPrefetcher stats:

prefetch=14, hit=6, skip_init=5, skip_new=28, skip_fpw=18, skip_rep=996

This looks useful to understand how the prefetch helped during long

recoveries.

I am wondering if we can periodically log this in standby mode as

well, not just before promoting?

Timer-based startup progress messaging allows logging such things
(ereport_startup_progress API). There was an attempt to enable "redo
in progress" for standbys, but that seemed to flood the standby logs
even at the default progress interval of 10 sec.

Having said that, the prefetcher stats could be added to the existing
ereport_startup_progress("redo in progress xxx") message that works
for crash recoveries—however, I don't prefer doing a bunch of atomic
reads every progress interval of 10 sec.

Therefore, logging at the end of recovery looks good to me.

+1 from me too to only of logging at the end of recovery (so -1 to

logging

every now and then). If someone is interested in current state (or

progress

over time) I think he can query pg_stat_recovery_prefetch view already,

even

today, right?

I reviewed the patch. I have the following comment:

+ elog(LOG, "XLogPrefetcher stats: prefetch=%lu, hit=%lu,
skip_init=%lu, skip_new=%lu, skip_fpw=%lu, skip_rep=%lu",

XLogPrefetcher is an internal data structure name, how about "redo
prefetch stats: xxxx" to be consistent with other redo log messages?

+1

please find the attached patch addressing this.

Hi Lakshmi,

The pg_stat_get_recovery_prefetch view has 3 additional columns, but they
are
showing current situation, maybe this $patch could be enhanced so that it
also
calculates averages for some of them? I mean perhaps not just hit ratio
would be
nice, but also the e.g. __average__ wal distance or average look ahead
(block distance).
I'm not sure maybe please also wait for input from others if they find
it a good idea
or not.

Also if we are adding such pretty cryptic fields as "skip_fpw" to log,
we would need some
place in documentaiton to explain their meaining probably.
monitoring.sgml already
explains them, so maybe just link to that. I mean if you look what we
have today (below),
they are little less crypting and have different format, not
"skip_fpw=123" but more
like "17 removed" so maybe "17 skipped due to FPW". Maybe

redo done at 0/75C7B290 system usage: CPU: user: 1.02 s, system: 0.51
s, elapsed: 1.53 s
checkpoint complete: end-of-recovery fast wait: wrote 16289 buffers
(99.4%), wrote 3 SLRU buffers; 0 WAL file(s) added, 17 removed, 0
recycled; write=0.049 s, sync=0.191 s, total=0.264 s; sync files=10,
longest=0.187 s, average=0.020 s; distance=287186 kB, estimate=287186
kB; lsn=0/75C7B2C8, redo lsn=0/75C7B2C8

so instead of like:
redo prefetch stats: prefetch=%lu, hit=%lu, skip_init=%lu,
skip_new=%lu, skip_fpw=%lu, skip_rep=%lu"

something like below ones:
redo prefetch stats: done %lu prefetches, %lu hit, %lu zero-initated, ..
redo prefetch stats: done %lu prefetches, (%d% hit ratio), %lu
zero-initated, .. or something like that

Please find the attached patch with the suggested changes. I referenced [1]https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-RECOVERY-PREFETCH
to log the message as suggested.

2026-03-24 04:53:15.251 PDT [18898] LOG: redo prefetch stats: prefetched
27 blocks, skipped 22 blocks because they were already in the buffer pool,
skipped 17 blocks because they would be zero-initialized, skipped 0 blocks
because they didn't exist yet, skipped 28 blocks because a full page image
was included in the WAL, skipped 155 blocks because they were already
recently prefetched.

[1]: https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-RECOVERY-PREFETCH
https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-RECOVERY-PREFETCH

Regards,
Lakshmi

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

2 months ago

In reply to: Lakshmi N (#7)

Re: log XLogPrefetch stats at end of recovery

Hi,

On Tue, Mar 24, 2026 at 5:07 AM Lakshmi N <lakshmin.jhs@gmail.com> wrote:

so instead of like:
redo prefetch stats: prefetch=%lu, hit=%lu, skip_init=%lu,
skip_new=%lu, skip_fpw=%lu, skip_rep=%lu"

something like below ones:
redo prefetch stats: done %lu prefetches, %lu hit, %lu zero-initated, ..
redo prefetch stats: done %lu prefetches, (%d% hit ratio), %lu
zero-initated, .. or something like that

Please find the attached patch with the suggested changes. I referenced [1] to log the message as suggested.

2026-03-24 04:53:15.251 PDT [18898] LOG: redo prefetch stats: prefetched 27 blocks, skipped 22 blocks because they were already in the buffer pool, skipped 17 blocks because they would be zero-initialized, skipped 0 blocks because they didn't exist yet, skipped 28 blocks because a full page image was included in the WAL, skipped 155 blocks because they were already recently prefetched.

IMHO, the above looks too verbose. +1 for Jakub's suggestion. Would
something like the below work? I believe the developers looking at
these logs for analysis will have some understanding of what each of
these means.

LOG: redo prefetch stats: prefetched 27, skipped (22 in buffer pool,
17 zero-inited, 0 non-existent, 28 FPI, 155 recently prefetched)

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

Lakshmi N

lakshmin.jhs@gmail.com

2 months ago

In reply to: Bharath Rupireddy (#8)

Re: log XLogPrefetch stats at end of recovery

Hi,

On Tue, Mar 24, 2026 at 8:23 PM Bharath Rupireddy <
bharath.rupireddyforpostgres@gmail.com> wrote:

Hi,

On Tue, Mar 24, 2026 at 5:07 AM Lakshmi N <lakshmin.jhs@gmail.com> wrote:

so instead of like:
redo prefetch stats: prefetch=%lu, hit=%lu, skip_init=%lu,
skip_new=%lu, skip_fpw=%lu, skip_rep=%lu"

something like below ones:
redo prefetch stats: done %lu prefetches, %lu hit, %lu zero-initated, ..
redo prefetch stats: done %lu prefetches, (%d% hit ratio), %lu
zero-initated, .. or something like that

Please find the attached patch with the suggested changes. I referenced

[1] to log the message as suggested.

2026-03-24 04:53:15.251 PDT [18898] LOG: redo prefetch stats:

prefetched 27 blocks, skipped 22 blocks because they were already in the
buffer pool, skipped 17 blocks because they would be zero-initialized,
skipped 0 blocks because they didn't exist yet, skipped 28 blocks because a
full page image was included in the WAL, skipped 155 blocks because they
were already recently prefetched.

IMHO, the above looks too verbose. +1 for Jakub's suggestion. Would
something like the below work? I believe the developers looking at
these logs for analysis will have some understanding of what each of
these means.

LOG: redo prefetch stats: prefetched 27, skipped (22 in buffer pool,
17 zero-inited, 0 non-existent, 28 FPI, 155 recently prefetched)

Address this in the attached patch. Please let me know if this looks good.
2026-03-26 00:51:19.797 PDT [91881] LOG: redo prefetch stats: prefetched
418376 blocks, skipped (362419 in the buffer pool, 418479 zero-initialized,
0 non-existent, 9 full page image, 39183318 recently prefetched)

Regards,
Lakshmi

#10

Bharath Rupireddy

bharath.rupireddyforpostgres@gmail.com

about 2 months ago

In reply to: Lakshmi N (#9)

Re: log XLogPrefetch stats at end of recovery

Hi,

On Thu, Mar 26, 2026 at 1:06 AM Lakshmi N <lakshmin.jhs@gmail.com> wrote:

Address this in the attached patch. Please let me know if this looks good.
2026-03-26 00:51:19.797 PDT [91881] LOG: redo prefetch stats: prefetched 418376 blocks, skipped (362419 in the buffer pool, 418479 zero-initialized, 0 non-existent, 9 full page image, 39183318 recently prefetched)

Thanks for sending the latest patch. It looks good to me with a
nit:s/362419 in the buffer pool/362419 already in buffer pool. I will
leave it to others' choice.

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

#11

SATYANARAYANA NARLAPURAM

satyanarlapuram@gmail.com

about 2 months ago

In reply to: Bharath Rupireddy (#10)

Re: log XLogPrefetch stats at end of recovery

On Thu, Mar 26, 2026 at 5:10 PM Bharath Rupireddy <
bharath.rupireddyforpostgres@gmail.com> wrote:

Hi,

On Thu, Mar 26, 2026 at 1:06 AM Lakshmi N <lakshmin.jhs@gmail.com> wrote:

Address this in the attached patch. Please let me know if this looks

good.

2026-03-26 00:51:19.797 PDT [91881] LOG: redo prefetch stats:

prefetched 418376 blocks, skipped (362419 in the buffer pool, 418479
zero-initialized, 0 non-existent, 9 full page image, 39183318 recently
prefetched)

Thanks for sending the latest patch. It looks good to me with a
nit:s/362419 in the buffer pool/362419 already in buffer pool. I will
leave it to others' choice

LGTM too, I will leave it to the committer's choice.

Thanks,
Satya

log XLogPrefetch stats at end of recovery

Attachments:

Attachments:

Attachments:

Attachments: