test: avoid redundant standby catchup in 049_wait_for_lsn

Started by Xuneng Zhou, 27 days ago, 7 messages, hackers

#1 Xuneng Zhou
xunengzhou@gmail.com

Hi Alexander, Hackers,

While working on adding more edge-case tests and fixing the timeline
handling for WAIT FOR LSN, I noticed that the overall runtime of the
test had increased by about 7 seconds since a8b61c23c5ff. I looked
into the slowdown and found a potential source.

Currently, the test creates the function, waits for the standby to
catch up, tests it, then creates the procedure and waits for the
standby to catch up again. Since both objects are only used by the
same block of top-level statement checks, we can create them together
in a single primary-side transaction and perform just one
wait_for_catchup() before running both standby-side calls.

This small TAP cleanup merges the creation of the PL/pgSQL wrapper
function and procedure used for the top-level WAIT FOR checks in
049_wait_for_lsn.pl.

The change preserves the same coverage while removing one redundant
replay catch-up on the delayed standby. It appears to reduce the test
runtime by about 7 seconds, though I have not yet looked into why so
much of the improvement comes from this change alone.
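
The counting argument behind the patch can be illustrated with a toy model. This is a minimal Python sketch, not the TAP test itself: `DelayedStandby`, `apply_delay_s`, and the assumption that each catch-up costs about one apply delay are all hypothetical, and the thread does not establish that this is where the 7 seconds actually go.

```python
from dataclasses import dataclass, field


@dataclass
class DelayedStandby:
    """Toy standby that replays pending WAL only after a fixed delay."""
    apply_delay_s: float            # hypothetical delay, not taken from the test
    clock_s: float = 0.0            # simulated wall-clock time
    catchups: int = 0               # number of catch-up waits performed
    replayed: set = field(default_factory=set)

    def wait_for_catchup(self, pending):
        # Assumption: every catch-up blocks for roughly one apply delay.
        self.clock_s += self.apply_delay_s
        self.catchups += 1
        self.replayed.update(pending)


def run_separate(standby):
    # Original shape: create one object, wait, check it; repeat for the other.
    for obj in ("function", "procedure"):
        standby.wait_for_catchup([obj])
        assert obj in standby.replayed
    return standby


def run_merged(standby):
    # Patched shape: create both objects, wait once, then check both.
    standby.wait_for_catchup(["function", "procedure"])
    assert {"function", "procedure"} <= standby.replayed
    return standby


before = run_separate(DelayedStandby(apply_delay_s=1.0))
after = run_merged(DelayedStandby(apply_delay_s=1.0))
print(before.catchups, after.catchups)   # 2 1
print(before.clock_s - after.clock_s)    # 1.0
```

Under these assumptions both objects are visible on the standby before they are used in either shape, and the merged shape saves exactly one catch-up wait.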

Patch attached.

Thanks.
--
Best,
Xuneng

Attachments:

v1-0001-test-merge-wrapper-DDL-and-catchup-in-wait_for_ls.patch (application/x-patch, +7 -12)

#2 Michael Paquier
michael@paquier.xyz
In reply to: Xuneng Zhou (#1)
Re: test: avoid redundant standby catchup in 049_wait_for_lsn

On Fri, Apr 17, 2026 at 08:25:35PM +0800, Xuneng Zhou wrote:

> The change preserves the same coverage while removing one redundant
> replay catch-up on the delayed standby. It appears to reduce the test
> runtime by about 7 seconds, though I have not yet looked into why so
> much of the improvement comes from this change alone.

Alexander may think differently and remove that, but I disagree. The
test is clearly written so that two wait checks happen, one
for CREATE FUNCTION and one for CREATE PROCEDURE. Removing the first
check to keep only the second one removes its meaning. In short, I
see nothing wrong to deal with here.
--
Michael

#3 Xuneng Zhou
xunengzhou@gmail.com
In reply to: Michael Paquier (#2)
Re: test: avoid redundant standby catchup in 049_wait_for_lsn

Hi Michael,

> Alexander may think differently and remove that, but I disagree. The
> test is clearly written so that two wait checks happen, one
> for CREATE FUNCTION and one for CREATE PROCEDURE. Removing the first
> check to keep only the second one removes its meaning. In short, I
> see nothing wrong to deal with here.

Thank you for the review. I agree that the two wait checks serve distinct
purposes and are not redundant. The main motivation for this patch was
efficiency. In my testing, the new test added approximately 7 seconds to
the runtime, while the creation of the procedure and function completed
quickly. I suspect the latency stems from the wait-for-catch-up step. When
I removed it, the test runtime dropped by about 7 seconds. I haven't yet
investigated why the wait is so costly in this case. I should probably look
into that before proposing this change.

Best,
Xuneng

#4 Alexander Korotkov
aekorotkov@gmail.com
In reply to: Xuneng Zhou (#3)
Re: test: avoid redundant standby catchup in 049_wait_for_lsn

Hi, Xuneng.

On Sat, Apr 18, 2026 at 7:20 AM Xuneng Zhou <xunengzhou@gmail.com> wrote:

> Thank you for the review. I agree that the two wait checks serve distinct
> purposes and are not redundant. The main motivation for this patch was
> efficiency. In my testing, the new test added approximately 7 seconds to
> the runtime, while the creation of the procedure and function completed
> quickly. I suspect the latency stems from the wait-for-catch-up step. When
> I removed it, the test runtime dropped by about 7 seconds. I haven't yet
> investigated why the wait is so costly in this case. I should probably look
> into that before proposing this change.

On my laptop the time needed to run t/049_wait_for_lsn.pl also drops
from 20 secs to 12 secs. The effect on the runtime of the whole
test suite run in parallel would not be that big, as CPU time only drops
from 2.16 sec to 2.07 sec. But anyway, that's pretty significant.
I've revised the commit message a bit, as well as the surrounding
comments. I'm going to push this if there are no objections.

------
Regards,
Alexander Korotkov
Supabase

Attachments:

v2-0001-049_wait_for_lsn.pl-create-function-and-procedure.patch (application/octet-stream, +11 -14)

#5 Alexander Korotkov
aekorotkov@gmail.com
In reply to: Michael Paquier (#2)
Re: test: avoid redundant standby catchup in 049_wait_for_lsn

Hi, Michael!

On Sat, Apr 18, 2026 at 12:47 AM Michael Paquier <michael@paquier.xyz> wrote:

> Alexander may think differently and remove that, but I disagree. The
> test is clearly written so that two wait checks happen, one
> for CREATE FUNCTION and one for CREATE PROCEDURE. Removing the first
> check to keep only the second one removes its meaning. In short, I
> see nothing wrong to deal with here.

Thank you for your observation. The intention of this test is to
check explicit calls to WAIT FOR LSN. Yes, wait_for_catchup() now
also internally calls WAIT FOR LSN. But checking wait_for_catchup()
is not the intention of this test; it's used in an awful lot of other
places.

------
Regards,
Alexander Korotkov
Supabase

#6 Alexander Korotkov
aekorotkov@gmail.com
In reply to: Alexander Korotkov (#4)
Re: test: avoid redundant standby catchup in 049_wait_for_lsn

On Sat, Apr 18, 2026 at 10:58 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:

> On my laptop the time needed to run t/049_wait_for_lsn.pl also drops
> from 20 secs to 12 secs. The effect on the runtime of the whole
> test suite run in parallel would not be that big, as CPU time only drops
> from 2.16 sec to 2.07 sec. But anyway, that's pretty significant.
> I've revised the commit message a bit, as well as the surrounding
> comments. I'm going to push this if there are no objections.

Pushed.

------
Regards,
Alexander Korotkov
Supabase

#7 Xuneng Zhou
xunengzhou@gmail.com
In reply to: Alexander Korotkov (#6)
Re: test: avoid redundant standby catchup in 049_wait_for_lsn

On Mon, Apr 20, 2026 at 6:21 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

> Pushed.

Thanks for pushing it. I haven't had time to investigate the latency
yet, but will do it later.

Best,
Xuneng