refactor subscription tests to use PostgresNode's wait_for_catchup

Started by Peter Eisentraut over 8 years ago | 5 messages | hackers
#1 Peter Eisentraut
peter_e@gmx.net

It appears that we have unwittingly created some duplicate and
copy-and-paste-prone code in src/test/subscription/ to wait for a
replication subscriber to catch up, when we already have
almost-sufficient code in PostgresNode to do that more compactly. So I
propose this patch to consolidate that.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-Refactor-subscription-tests-to-use-PostgresNode-s-wa.patch (text/plain; charset=UTF-8), +39 -97

#2 Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#1)
Re: refactor subscription tests to use PostgresNode's wait_for_catchup

On Mon, Jan 08, 2018 at 09:46:21PM -0500, Peter Eisentraut wrote:

> It appears that we have unwittingly created some duplicate and
> copy-and-paste-prone code in src/test/subscription/ to wait for a
> replication subscriber to catch up, when we already have
> almost-sufficient code in PostgresNode to do that more compactly. So I
> propose this patch to consolidate that.

This looks sane to me. I have two comments from reading the
surrounding code.

@@ -1505,7 +1515,7 @@ sub wait_for_catchup
       . $target_lsn . " on "
       . $self->name . "\n";
     my $query =
-qq[SELECT '$target_lsn' <= ${mode}_lsn FROM pg_catalog.pg_stat_replication WHERE application_name = '$standby_name';];
+qq[SELECT $lsn_expr <= ${mode}_lsn FROM pg_catalog.pg_stat_replication WHERE application_name = '$standby_name';];
     $self->poll_query_until('postgres', $query)
       or die "timed out waiting for catchup, current location is "
       . ($self->safe_psql('postgres', $query) || '(unknown)');

This log has been wrong from the beginning. Here $query returns a boolean
status, not a location. I think that when the poll dies because of a
timeout, you should look up ${mode}_lsn in pg_stat_replication for the
row whose application_name matches $standby_name. Could you fix that as
well?
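
To make the point concrete: the polled query yields only a boolean, so printing its result on timeout cannot show a location. The following is a minimal sketch of the two queries side by side. The helper names catchup_poll_query and catchup_position_query are hypothetical and are not part of PostgresNode, and the LSN and standby name are made-up sample values. The sketch only builds SQL strings, so it runs without a server.

```perl
use strict;
use warnings;

# Hypothetical helper: the boolean condition that wait_for_catchup polls.
# $lsn_expr is an SQL expression, e.g. a quoted LSN literal.
sub catchup_poll_query
{
	my ($lsn_expr, $mode, $standby_name) = @_;
	return
	  qq[SELECT $lsn_expr <= ${mode}_lsn FROM pg_catalog.pg_stat_replication WHERE application_name = '$standby_name';];
}

# Hypothetical helper: a separate query that returns the standby's actual
# position, which is what a timeout message would need to print.
sub catchup_position_query
{
	my ($mode, $standby_name) = @_;
	return
	  qq[SELECT ${mode}_lsn FROM pg_catalog.pg_stat_replication WHERE application_name = '$standby_name';];
}

# Sample values only; no database connection is made.
print catchup_poll_query("'0/3000060'", 'replay', 'standby_1'), "\n";
print catchup_position_query('replay', 'standby_1'), "\n";
```

The cost of this approach is the second query and its extra round trip on the failure path.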

Could you also update promote_standby in RewindTest.pm? Your refactoring
to fall back to pg_current_wal_lsn() when no target_lsn is given makes
this move possible. Using the generic APIs gives better logs as well.
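
As an illustration of why the fallback matters for a caller like promote_standby, here is a minimal sketch of choosing the LSN expression. The helpers lsn_expr_for and catchup_query are hypothetical names, not the functions in the patch, and 'rewind_standby' and 'write' are assumed values. Only the SQL text is built, so no server is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: quote an explicit LSN as an SQL literal, otherwise
# fall back to the sending node's current WAL position.
sub lsn_expr_for
{
	my ($target_lsn) = @_;
	return defined($target_lsn) ? "'$target_lsn'" : 'pg_current_wal_lsn()';
}

# Hypothetical helper: build the polling query for a standby and mode.
sub catchup_query
{
	my ($standby_name, $mode, $target_lsn) = @_;
	my $lsn_expr = lsn_expr_for($target_lsn);
	return
	  qq[SELECT $lsn_expr <= ${mode}_lsn FROM pg_catalog.pg_stat_replication WHERE application_name = '$standby_name';];
}

# Without a target LSN the caller only names the standby and the mode.
print catchup_query('rewind_standby', 'write'), "\n";
# With an explicit (made-up) target LSN the literal is quoted instead.
print catchup_query('rewind_standby', 'write', '0/3000060'), "\n";
```

With the fallback in place, a test that only wants "everything written so far" does not have to fetch the current LSN itself before waiting.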
--
Michael

#3 Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#2)
Re: refactor subscription tests to use PostgresNode's wait_for_catchup

On 1/8/18 23:47, Michael Paquier wrote:

> @@ -1505,7 +1515,7 @@ sub wait_for_catchup
>       . $target_lsn . " on "
>       . $self->name . "\n";
>     my $query =
> -qq[SELECT '$target_lsn' <= ${mode}_lsn FROM pg_catalog.pg_stat_replication WHERE application_name = '$standby_name';];
> +qq[SELECT $lsn_expr <= ${mode}_lsn FROM pg_catalog.pg_stat_replication WHERE application_name = '$standby_name';];
>     $self->poll_query_until('postgres', $query)
>       or die "timed out waiting for catchup, current location is "
>       . ($self->safe_psql('postgres', $query) || '(unknown)');
>
> This log has been wrong from the beginning. Here $query returns a boolean
> status, not a location. I think that when the poll dies because of a
> timeout, you should look up ${mode}_lsn in pg_stat_replication for the
> row whose application_name matches $standby_name. Could you fix that as
> well?

Should we just remove it? Apparently, it was never functional to begin
with. Otherwise, we'd have to write a second query to return the value
to print. wait_for_slot_catchup has the same issue. Seems like a lot
of overhead for something that has never been used.

> Could you also update promote_standby in RewindTest.pm? Your refactoring
> to fall back to pg_current_wal_lsn() when no target_lsn is given makes
> this move possible. Using the generic APIs gives better logs as well.

Right.

--
Peter Eisentraut http://www.2ndQuadrant.com/

#4 Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#3)
Re: refactor subscription tests to use PostgresNode's wait_for_catchup

On Wed, Jan 10, 2018 at 09:45:56PM -0500, Peter Eisentraut wrote:

> On 1/8/18 23:47, Michael Paquier wrote:
> Should we just remove it? Apparently, it was never functional to begin
> with. Otherwise, we'd have to write a second query to return the value
> to print. wait_for_slot_catchup has the same issue. Seems like a lot
> of overhead for something that has never been used.

Fine for me to remove it.
--
Michael

#5 Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#4)
Re: refactor subscription tests to use PostgresNode's wait_for_catchup

On 1/10/18 22:24, Michael Paquier wrote:

> On Wed, Jan 10, 2018 at 09:45:56PM -0500, Peter Eisentraut wrote:
>> On 1/8/18 23:47, Michael Paquier wrote:
>> Should we just remove it? Apparently, it was never functional to begin
>> with. Otherwise, we'd have to write a second query to return the value
>> to print. wait_for_slot_catchup has the same issue. Seems like a lot
>> of overhead for something that has never been used.

committed

--
Peter Eisentraut http://www.2ndQuadrant.com/