Unportable use of select for timeouts in PostgresNode.pm

Started by Andrew Dunstan over 8 years ago, 3 messages
#1 Andrew Dunstan
andrew.dunstan@2ndquadrant.com

I've been trying to get to the bottom of a nasty hang in buildfarm
member jacana when running the pg_ctl TAP test. This test used to work,
and was last known to work on June 22nd.

My attention has become focussed on this change in commit de3de0afd:

    -       # Wait a second before retrying.
    -       sleep 1;
    +       # Wait 0.1 second before retrying.
    +       select undef, undef, undef, 0.1;

This is a usage that is known not to work in Windows - IIRC we
eliminated such calls from our C programs at the time of the Windows
port - and it seems to me very likely to be the cause of the hang.
Instead I think we should use the usleep() function from the standard
(from 5.8) Perl module Time::HiRes, as recommended in the Perl docs for
the sleep() function for situations where you need finer grained
timeouts. I have verified that this works on jacana and friends.

Unless I hear objections I'll prepare a patch along those lines.
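
For illustration only, here is a minimal sketch of the kind of retry loop
this would produce. It is not the actual patch: poll_until and its
arguments are hypothetical names invented for this example, and only
usleep() from Time::HiRes (which takes microseconds) is a real API.

```perl
use strict;
use warnings;
use Time::HiRes qw(usleep);

# Hypothetical helper (not from PostgresNode.pm): call $check up to
# $max_attempts times, sleeping 0.1 second after each failed attempt.
# Returns the attempt number that succeeded, or 0 if none did.
sub poll_until
{
	my ($check, $max_attempts) = @_;
	for my $attempt (1 .. $max_attempts)
	{
		return $attempt if $check->();

		# Wait 0.1 second before retrying. usleep() takes microseconds
		# and avoids the four-argument select() idiom.
		usleep(100_000);
	}
	return 0;
}

# Usage: a check that succeeds on its third call.
my $calls = 0;
my $attempts = poll_until(sub { return ++$calls >= 3 }, 10);
print "succeeded after $attempts attempts\n";
```

The only change relative to the select-based version is the sleep call
itself; the surrounding loop logic stays the same.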

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2 Michael Paquier
michael.paquier@gmail.com
In reply to: Andrew Dunstan (#1)
Re: Unportable use of select for timeouts in PostgresNode.pm

On Mon, Jul 17, 2017 at 4:48 PM, Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:

> This is a usage that is known not to work in Windows - IIRC we
> eliminated such calls from our C programs at the time of the Windows
> port - and it seems to me very likely to be the cause of the hang.
> Instead I think we should use the usleep() function from the standard
> (from 5.8) Perl module Time::HiRes, as recommended in the Perl docs for
> the sleep() function for situations where you need finer grained
> timeouts. I have verified that this works on jacana and friends.

Looking at my boxes (Arch, Mac, Windows), Time::HiRes looks to be part
of the core set of packages, so there is evidently no real need to
incorporate a check in configure.in. So +1 for doing as you suggest.
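
As a rough sketch of how one might confirm this on a given box (an
illustration, not something run for this thread), the snippet below tries
to load the module at run time. have_usleep is a hypothetical name chosen
for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if Time::HiRes can be loaded and
# provides usleep() on this platform, 0 otherwise.
sub have_usleep
{
	my $ok = eval { require Time::HiRes; Time::HiRes->can('usleep') };
	return $ok ? 1 : 0;
}

print have_usleep()
  ? "Time::HiRes::usleep is available\n"
  : "Time::HiRes::usleep is missing\n";
```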
--
Michael


#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#1)
Re: Unportable use of select for timeouts in PostgresNode.pm

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:

> I've been trying to get to the bottom of a nasty hang in buildfarm
> member jacana when running the pg_ctl TAP test. This test used to work,
> and was last known to work on June 22nd.
>
> My attention has become focussed on this change in commit de3de0afd:
>
> -       # Wait a second before retrying.
> -       sleep 1;
> +       # Wait 0.1 second before retrying.
> +       select undef, undef, undef, 0.1;
>
> This is a usage that is known not to work in Windows - IIRC we
> eliminated such calls from our C programs at the time of the Windows
> port - and it seems to me very likely to be the cause of the hang.

Ugh.

> Instead I think we should use the usleep() function from the standard
> (from 5.8) Perl module Time::HiRes, as recommended in the Perl docs for
> the sleep() function for situations where you need finer grained
> timeouts. I have verified that this works on jacana and friends.
>
> Unless I hear objections I'll prepare a patch along those lines.

WFM. Thanks for taking care of it.

regards, tom lane
