Failure in TAP tests of pg_ctl on Windows with parallel instance set

Started by Michael Paquier about 6 years ago, 3 messages
#1 Michael Paquier
michael@paquier.xyz
1 attachment(s)

Hi all,

I ran the TAP tests on Windows with a Postgres instance already running
locally on port 5432, and noticed that 001_start_stop.pl fails several
tests because it uses the default port for the node initialized with
pg_ctl. The problem is easily fixed by assigning a random port number
to that instance.
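
As a rough illustration of the idea (this is not the patch, and not
how PostgresNode picks ports), a free port can be obtained by letting
the kernel choose one. The helper name pick_free_port is hypothetical.
Note the inherent race: another process may take the port between the
probe socket being closed and the server starting.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper, for illustration only: bind a probe socket to
# port 0 on 127.0.0.1 so that the kernel assigns an unused TCP port,
# remember that port number, then release the socket.
sub pick_free_port
{
    my $sock = IO::Socket::INET->new(
        LocalAddr => '127.0.0.1',
        LocalPort => 0,
        Proto     => 'tcp',
        Listen    => 1) or die "could not bind a probe socket: $@";
    my $port = $sock->sockport;
    close $sock;
    return $port;
}

# The chosen port would then be appended to postgresql.conf as a
# "port = N" line before starting the node.
my $node_port = pick_free_port();
print "port = $node_port\n";
```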

A port conflict could still occur if TAP tests run in parallel on
Windows while the node is being initialized, but that is already a
problem today for all the tests, as all nodes listen on 127.0.0.1 in
this case. This cannot happen on *nix because each node uses a unique
Unix domain socket path, so things work even if port numbers collide.
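
The TCP side of this can be sketched outside of Postgres. The snippet
below is only an assumption-laden demonstration, with a hypothetical
listen_on helper: a second listener on the same 127.0.0.1 port is
expected to be refused while the first one is still open, which is the
kind of collision two nodes sharing a port would hit.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Hypothetical helper: return a listening TCP socket on 127.0.0.1:$port,
# or undef if the bind fails. Address reuse is deliberately not enabled.
sub listen_on
{
    my ($port) = @_;
    return IO::Socket::INET->new(
        LocalAddr => '127.0.0.1',
        LocalPort => $port,
        Proto     => 'tcp',
        Listen    => 1);
}

# First listener: port 0 lets the kernel pick an unused port.
my $first = listen_on(0) or die "could not bind first socket: $@";
my $port = $first->sockport;

# Second listener on the very same address and port.
my $second = listen_on($port);
print defined $second
  ? "second bind on port $port succeeded\n"
  : "second bind on port $port failed: $@\n";
```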

Attached is a patch to fix this issue, which I would like to
back-patch down to 9.4, where the issue can show up.

Any objections?
--
Michael

Attachments:

pgctl-tap-win32.patch (text/x-diff; charset=us-ascii)
diff --git a/src/bin/pg_ctl/t/001_start_stop.pl b/src/bin/pg_ctl/t/001_start_stop.pl
index e5d46a6f25..6a1619e171 100644
--- a/src/bin/pg_ctl/t/001_start_stop.pl
+++ b/src/bin/pg_ctl/t/001_start_stop.pl
@@ -22,8 +22,10 @@ command_ok([ 'pg_ctl', 'initdb', '-D', "$tempdir/data", '-o', '-N' ],
 	'pg_ctl initdb');
 command_ok([ $ENV{PG_REGRESS}, '--config-auth', "$tempdir/data" ],
 	'configure authentication');
+my $node_port = get_free_port();
 open my $conf, '>>', "$tempdir/data/postgresql.conf";
 print $conf "fsync = off\n";
+print $conf "port = $node_port\n";
 print $conf TestLib::slurp_file($ENV{TEMP_CONFIG})
   if defined $ENV{TEMP_CONFIG};
 
#2 Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Michael Paquier (#1)
Re: Failure in TAP tests of pg_ctl on Windows with parallel instance set

On 12/1/19 10:14 PM, Michael Paquier wrote:

> Hi all,
>
> I ran the TAP tests on Windows with a Postgres instance already running
> locally on port 5432, and noticed that 001_start_stop.pl fails several
> tests because it uses the default port for the node initialized with
> pg_ctl. The problem is easily fixed by assigning a random port number
> to that instance.
>
> A port conflict could still occur if TAP tests run in parallel on
> Windows while the node is being initialized, but that is already a
> problem today for all the tests, as all nodes listen on 127.0.0.1 in
> this case. This cannot happen on *nix because each node uses a unique
> Unix domain socket path, so things work even if port numbers collide.
>
> Attached is a patch to fix this issue, which I would like to
> back-patch down to 9.4, where the issue can show up.
>
> Any objections?

Looks reasonable. I wonder if there are other test sets where we need to
set the port.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3 Michael Paquier
michael@paquier.xyz
In reply to: Andrew Dunstan (#2)
Re: Failure in TAP tests of pg_ctl on Windows with parallel instance set

On Mon, Dec 02, 2019 at 07:57:31AM -0500, Andrew Dunstan wrote:

> Looks reasonable.

Thanks, committed and back-patched down to 11, which is as far back as
we have PostgresNode::get_free_port. This could go further back with
more refactoring of PostgresNode.pm, but since it took a long time for
this issue to be noticed, that does not seem worth the extra legwork.

> I wonder if there are other test sets where we need to set the port.

I looked at that before sending the first email. The tests of initdb
and pg_basebackup looked like candidates for breakage, but nowhere do
we initialize a node directly with initdb and then start it without
PostgresNode.pm. So we are fine as far as I can see.
--
Michael