Thread: [HACKERS] Unportable use of select for timeouts in PostgresNode.pm
I've been trying to get to the bottom of a nasty hang in buildfarm member
jacana when running the pg_ctl TAP test. This test used to work, and was
last known to work on June 22nd.

My attention has become focussed on this change in commit de3de0afd:

    -       # Wait a second before retrying.
    -       sleep 1;
    +       # Wait 0.1 second before retrying.
    +       select undef, undef, undef, 0.1;

This is a usage that is known not to work in Windows - IIRC we
eliminated such calls from our C programs at the time of the Windows
port - and it seems to me very likely to be the cause of the hang.
Instead I think we should use the usleep() function from the standard
(from 5.8) Perl module Time::HiRes, as recommended in the Perl docs for
the sleep() function for situations where you need finer grained
timeouts. I have verified that this works on jacana and friends.

Unless I hear objections I'll prepare a patch along those lines.

cheers

andrew

--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Jul 17, 2017 at 4:48 PM, Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:
> This is a usage that is known not to work in Windows - IIRC we
> eliminated such calls from our C programs at the time of the Windows
> port - and it seems to me very likely to be the cause of the hang.
> Instead I think we should use the usleep() function from the standard
> (from 5.8) Perl module Time::HiRes, as recommended in the Perl docs for
> the sleep() function for situations where you need finer grained
> timeouts. I have verified that this works on jacana and friends.

Looking at my boxes (Arch, Mac, Windows), Time::HiRes looks to be part
of the core set of packages, so there is visibly no real need to
incorporate a check in configure.in. So +1 for doing as you suggest.
--
Michael
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> I've been trying to get to the bottom of a nasty hang in buildfarm
> member jacana when running the pg_ctl TAP test. This test used to work,
> and was last known to work on June 22nd.

> My attention has become focussed on this change in commit de3de0afd:

> -       # Wait a second before retrying.
> -       sleep 1;
> +       # Wait 0.1 second before retrying.
> +       select undef, undef, undef, 0.1;

> This is a usage that is known not to work in Windows - IIRC we
> eliminated such calls from our C programs at the time of the Windows
> port - and it seems to me very likely to be the cause of the hang.

Ugh.

> Instead I think we should use the usleep() function from the standard
> (from 5.8) Perl module Time::HiRes, as recommended in the Perl docs for
> the sleep() function for situations where you need finer grained
> timeouts. I have verified that this works on jacana and friends.

> Unless I hear objections I'll prepare a patch along those lines.

WFM. Thanks for taking care of it.

			regards, tom lane
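
For reference, the replacement proposed in this thread could look roughly
like the sketch below. It is not the patch that was committed and not code
from PostgresNode.pm: the helper poll_until() and its parameters ($check,
$max_attempts) are hypothetical names chosen for illustration. The only
thing taken from the thread is the idea of calling usleep() from
Time::HiRes in place of the four-argument select() for a 0.1 second wait.

```perl
use strict;
use warnings;

# Time::HiRes has shipped with core Perl since 5.8; usleep() takes
# its argument in microseconds.
use Time::HiRes qw(usleep);

# Hypothetical helper (not from PostgresNode.pm): call $check up to
# $max_attempts times, pausing 0.1 second after each failed attempt.
# Returns 1 as soon as $check returns true, 0 if every attempt fails.
sub poll_until
{
	my ($check, $max_attempts) = @_;

	foreach my $attempt (1 .. $max_attempts)
	{
		return 1 if $check->();

		# Wait 0.1 second before retrying.
		usleep(100_000);
	}
	return 0;
}

# Usage: a stand-in condition that becomes true on the third call.
my $tries = 0;
my $ok = poll_until(sub { return ++$tries >= 3; }, 10);
print "ok=$ok after $tries tries\n";
```

The 100_000 argument is 0.1 second expressed in microseconds, matching the
delay in the diff quoted above. Time::HiRes also exports a sleep() that
accepts fractional seconds, which would be an equally plausible choice
under the same assumptions.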