On the stability of TAP tests for LDAP
Hi all,
(Peter Eisentraut in CC)
crake has just complained about a failure with the LDAP test suite:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2019-07-19%2001%3A33%3A31
# Running: /usr/sbin/slapd -f
/home/bf/bfr/root/HEAD/pgsql.build/src/test/ldap/tmp_check/slapd.conf
-h ldap://localhost:55306 ldaps://localhost:55307
# loading LDAP data
# Running: ldapadd -x -y
/home/bf/bfr/root/HEAD/pgsql.build/src/test/ldap/tmp_check/ldappassword
-f authdata.ldif
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
And FWIW, when running a parallel check-world to make my laptop busy
enough, it is rather usual to face failures with this test, which is
annoying. Shouldn't we have at least a number of retries with
intermediate sleeps for the commands run in 001_auth.pl? As far as I
recall, I think that we can run into failures when calling ldapadd and
ldappasswd.
Thanks,
--
Michael
On Fri, Jul 19, 2019 at 3:30 PM Michael Paquier <michael@paquier.xyz> wrote:
# Running: /usr/sbin/slapd -f
/home/bf/bfr/root/HEAD/pgsql.build/src/test/ldap/tmp_check/slapd.conf
-h ldap://localhost:55306 ldaps://localhost:55307
# loading LDAP data
# Running: ldapadd -x -y
/home/bf/bfr/root/HEAD/pgsql.build/src/test/ldap/tmp_check/ldappassword
-f authdata.ldif
ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)And FWIW, when running a parallel check-world to make my laptop busy
enough, it is rather usual to face failures with this test, which is
annoying. Shouldn't we have at least a number of retries with
intermediate sleeps for the commands run in 001_auth.pl? As far as I
recall, I think that we can run into failures when calling ldapadd and
ldappasswd.
Yeah, it seems we need to figure out a way to wait for it to be ready
to accept connections. I wondered how other people do this, and found
one example that polls for the .pid file:
https://github.com/tiredofit/docker-openldap/blob/master/install/etc/cont-init.d/10-openldap#L347
That looks nice and tidy but I'm not sure it can be trusted, given
that OpenLDAP's own tests poll a trivial ldapsearch (also based on a
cursory glance at slapd/main.c which appears to write the .pid file a
bit too early, though I may have misread):
https://github.com/openldap/openldap/blob/master/tests/scripts/test039-glue-ldap-concurrency#L59
I guess we should do that too. I don't know how to write Perl but I'll try...
--
Thomas Munro
https://enterprisedb.com
On Wed, Jul 24, 2019 at 3:52 PM Thomas Munro <thomas.munro@gmail.com> wrote:
I guess we should do that too. I don't know how to write Perl but I'll try...
Does this look about right?
--
Thomas Munro
https://enterprisedb.com
Attachments:
wait-for-slapd.patchapplication/octet-stream; name=wait-for-slapd.patchDownload+9-0
On Wed, Jul 24, 2019 at 04:41:05PM +1200, Thomas Munro wrote:
On Wed, Jul 24, 2019 at 3:52 PM Thomas Munro <thomas.munro@gmail.com> wrote:
I guess we should do that too. I don't know how to write Perl but I'll try...
Does this look about right?
Some comments from here. I have not tested the patch.
I would recommend using TestLib::system_log instead of plain system().
The command should be a list of arguments with one element per
argument (see call of system_log in PostgresNode.pm for example). The
indentation is incorrect, and that I would make the retry longer as I
got the feeling that on slow machines we could still have issues. We
also usually tend to increase the timeout up to 5 minutes, and the
sleep phases make use of Time::HiRes::usleep.
--
Michael
On Wed, Jul 24, 2019 at 5:26 PM Michael Paquier <michael@paquier.xyz> wrote:
Does this look about right?
Some comments from here. I have not tested the patch.
I would recommend using TestLib::system_log instead of plain system().
The command should be a list of arguments with one element per
argument (see call of system_log in PostgresNode.pm for example). The
indentation is incorrect, and that I would make the retry longer as I
got the feeling that on slow machines we could still have issues. We
also usually tend to increase the timeout up to 5 minutes, and the
sleep phases make use of Time::HiRes::usleep.
Thanks, here's v2.
--
Thomas Munro
https://enterprisedb.com
Attachments:
wait-for-slapd-v2.patchapplication/octet-stream; name=wait-for-slapd-v2.patchDownload+11-0
On Wed, Jul 24, 2019 at 05:47:13PM +1200, Thomas Munro wrote:
Thanks, here's v2.
Perhaps this worked on freebsd? Now that I test it, the test gets
stuck on my Debian box:
# waiting for slapd to accept requests...
# Running: ldapsearch -h localhost -p 49534 -s base -b
dc=example,dc=net -n 'objectclass=*'
SASL/DIGEST-MD5 authentication started
Please enter your password:
ldap_sasl_interactive_bind_s: Invalid credentials (49)
additional info: SASL(-13): user not found: no secret in
database
pgperltidy complains about the patch indentation using perltidy
v20170521 (version mentioned in tools/pgindent/README).
--
Michael
On Wed, Jul 24, 2019 at 7:50 PM Michael Paquier <michael@paquier.xyz> wrote:
Perhaps this worked on freebsd? Now that I test it, the test gets
stuck on my Debian box:
# waiting for slapd to accept requests...
# Running: ldapsearch -h localhost -p 49534 -s base -b
dc=example,dc=net -n 'objectclass=*'
SASL/DIGEST-MD5 authentication started
Please enter your password:
ldap_sasl_interactive_bind_s: Invalid credentials (49)
additional info: SASL(-13): user not found: no secret in
database
Huh, yeah, I don't know why slapd requires credentials on Debian, when
the version that ships with FreeBSD is OK with an anonymous
connection. Rather than worrying about that, I just adjusted it to
supply the credentials. It works on both for me.
pgperltidy complains about the patch indentation using perltidy
v20170521 (version mentioned in tools/pgindent/README).
Fixed.
--
Thomas Munro
https://enterprisedb.com
Attachments:
wait-for-slapd-v3.patchapplication/octet-stream; name=wait-for-slapd-v3.patchDownload+16-0
On Wed, Jul 24, 2019 at 09:01:47PM +1200, Thomas Munro wrote:
Huh, yeah, I don't know why slapd requires credentials on Debian, when
the version that ships with FreeBSD is OK with an anonymous
connection. Rather than worrying about that, I just adjusted it to
supply the credentials. It works on both for me.
Thanks for the updated patch, this looks good. I have done a series
of tests keeping my laptop busy and I haven't seen a failure where I
usually see problems 10%~20% of the time. So that seems to help,
thanks!
--
Michael
On Thu, Jul 25, 2019 at 12:51 PM Michael Paquier <michael@paquier.xyz> wrote:
Thanks for the updated patch, this looks good. I have done a series
of tests keeping my laptop busy and I haven't seen a failure where I
usually see problems 10%~20% of the time. So that seems to help,
thanks!
Pushed, thanks.
--
Thomas Munro
https://enterprisedb.com