BUG #14432: sslmode=allow causing authentication to time out

Started by Noname over 9 years ago, 2 messages, bugs
#1 Noname
nunziotocci2000@gmail.com

The following bug has been logged on the website:

Bug reference: 14432
Logged by: Annunziato Tocci
Email address: nunziotocci2000@gmail.com
PostgreSQL version: 9.6.1
Operating system: Fedora 24, 64-bit
Description:

I have a project that exposes a database through an API, and my test suites
have a problem.

I send 100 login requests to PostgreSQL, and 2-3 of them come back 2 minutes
later saying "server closed the connection unexpectedly", and the server
logs say "canceling authentication due to timeout".

It is reproducible on Linux systems, but I haven't yet tested 9.6 on macOS
or Windows 10.

I tracked it down to sslmode=allow. The below script reproduces the
problem.

# this only says "too many clients"
$ ./test.sh disable

# this says "too many clients", interspersed with "server closed the
# connection unexpectedly" errors, but the logs don't say "canceling
# authentication due to timeout"
$ ./test.sh allow

I believe it is the same underlying problem, but there are slightly
different errors as described above.
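
To compare the two runs, the client-side errors could be tallied by message. This is only a minimal sketch: `tally_errors` is a hypothetical helper that is not part of the original report, and it assumes psql prefixes the first line of each connection error with "psql: ", which keeps server log lines out of the count.

```sh
#!/bin/sh
# tally_errors: hypothetical helper, not part of the original report.
# Reads mixed output on stdin, keeps only lines that start with "psql: "
# (assumed to be the first line of each client error), and counts
# identical messages, most frequent first.
tally_errors() {
    grep '^psql: ' | sort | uniq -c | sort -rn
}
```

It could be used as `./test.sh allow 2>&1 | tally_errors`, and likewise with `disable`, so that the two sets of counts can be put side by side.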

Begin script:
```
#!/bin/sh

rm -rf data_test

initdb -D data_test -E UTF8 -U postgres
echo "unix_socket_directories='/tmp'" >> data_test/postgresql.conf

pg_ctl start -D data_test -o "-p 5431"
sleep 3
psql -U postgres -h 127.0.0.1 -p 5431 postgres -1c "ALTER ROLE postgres PASSWORD 'password';"
pg_ctl stop -D data_test

cp data_test/pg_hba.conf data_test/pg_hba.bak
sed -e '/trust/s/32 trust/32 md5/g' < data_test/pg_hba.bak > data_test/pg_hba.conf
echo "host all all 192.168.0.0/16 md5" >> data_test/pg_hba.conf

pg_ctl start -D data_test -o "-p 5431 -N 1000"
sleep 3
pids=""
echo "127.0.0.1:5431:postgres:postgres:password" > ./pgpass
chmod 0600 ./pgpass
export PGPASSFILE="./pgpass"
for i in `seq 1 10000`; do
#echo "loop $i"
psql "host=127.0.0.1 dbname=postgres port=5431 sslmode=$1 user=postgres" -1c "SELECT 'test', pg_sleep(5);" > /dev/null &
pids="$pids $!"
done
wait $pids
pg_ctl stop -D data_test
```

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#1)
Re: BUG #14432: sslmode=allow causing authentication to time out

nunziotocci2000@gmail.com writes:

> I send 100 login requests to PostgreSQL, and 2-3 of them come back 2 minutes
> later saying "server closed the connection unexpectedly", and the server
> logs say "canceling authentication due to timeout".

FWIW, I couldn't reproduce this (using RHEL6, don't have a Fedora
installation at the moment).

> I tracked it down to sslmode=allow. The below script reproduces the
> problem.

Since you haven't done anything to enable SSL in your test server,
sslmode=allow shouldn't have any effect except to allow libpq to
retry a failed connection attempt one time. libpq is pretty
simple-minded about that and will retry no matter what the specific
error report is, in particular it would do so for "too many clients".
So basically this ought to just increase the number of "too many clients"
failures you get. I wonder whether you are running into kernel
resource limits like number of processes or number of open files.
I was able to get some "fork failed: Resource temporarily unavailable"
type errors if I pushed max_connections high enough, but no unexpected
behavior.

regards, tom lane
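
The kernel resource limits mentioned above can be inspected from the shell that launches the test. The following is a rough sketch, not a definitive check: `warn_if_low` and the `NEEDED` variable are hypothetical names, and the default of 1000 is an assumption that simply mirrors the `-N 1000` in the test script. Since the script also starts one psql process per attempt, the client processes count against the same per-user process limit when the clients and the server run as the same user.

```sh
#!/bin/sh
# Sketch: compare the shell's soft limits with a target connection count.
# NEEDED is an assumption; 1000 mirrors the "-N 1000" in the test script.
NEEDED=${NEEDED:-1000}

# warn_if_low NAME LIMIT NEEDED (hypothetical helper)
# Prints a warning when LIMIT is numeric and smaller than NEEDED.
warn_if_low() {
    case "$2" in
        ''|unlimited)
            echo "$1: no numeric cap reported" ;;
        *)
            if [ "$2" -lt "$3" ]; then
                echo "$1: $2 is below $3"
            else
                echo "$1: $2 ok"
            fi ;;
    esac
}

# Soft limit on open file descriptors per process.
nofile=$(ulimit -n)
# Soft limit on user processes: -u in bash and ksh, -p in dash.
nproc=$( (ulimit -u) 2>/dev/null || (ulimit -p) 2>/dev/null )

warn_if_low "open files" "$nofile" "$NEEDED"
warn_if_low "user processes" "$nproc" "$NEEDED"
```

A limit that passes this comparison does not rule out resource exhaustion, because other processes owned by the same user also count; it only flags the obviously low cases.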
