Problems with max_connections parameter
Hi all,
I am a master's student in database systems, and I'm working on a stress-testing
methodology.
The goal is to stress-test PostgreSQL under different workloads, so I would
like to discuss some results with you.
My methodology is based on incrementing two variables: complexity and
workload. The complexity is the setup of the testing environment and the
SUT, i.e. the hardware and software configuration. The workload is the number
of transactions submitted to the SUT.
The test case increases the number of transactions to find the limit of the
system.
I had some problems with the max_connections parameter.
By default, the value of this parameter is 100.
I set its value to 2,000, according to the amount of available memory. I
used the following formula to set the operating system parameter SHMMAX:
SHMMAX = 250 kB + 8.2 kB * shared_buffers + 14.2 kB * max_connections
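For illustration, the formula above can be checked with a few lines of Python (the shared_buffers value below is an assumed example, not a value taken from this thread):

```python
def shmmax_kb(shared_buffers, max_connections):
    """Compute SHMMAX in kB from the formula quoted above.
    shared_buffers is a buffer count, max_connections a connection count."""
    return 250 + 8.2 * shared_buffers + 14.2 * max_connections

# Example: 2,000 connections and 16384 shared buffers (assumed value)
print(round(shmmax_kb(16384, 2000)))  # total kB needed for SHMMAX
```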
The database started correctly.
I began the tests using 5 client machines and 1 server. On the server side,
I just ran Postgres. On the client side, I used threads to simulate a large
number of transactions.
I simulated 1000, 10000, and 100000 transactions, sequentially.
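As a rough Python sketch of such a client-side driver (the real client was a Java application; the transaction callable here is a stand-in for the actual JDBC work):

```python
import threading

class Driver:
    """Sketch: each thread submits one transaction over its own connection,
    and we tally completions vs. server-refused connections."""
    def __init__(self):
        self.lock = threading.Lock()
        self.completed = 0
        self.refused = 0

    def _worker(self, run_txn):
        try:
            run_txn()                    # would open a connection, run the ops, close
            with self.lock:
                self.completed += 1
        except ConnectionError:          # models the server refusing a connection
            with self.lock:
                self.refused += 1

    def run_batch(self, n, run_txn):
        threads = [threading.Thread(target=self._worker, args=(run_txn,))
                   for _ in range(n)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return self.completed, self.refused
```

With a connection-per-transaction pattern like this, every submitted transaction pays the full connect/disconnect cost, which matters for the discussion below.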
I used the TPC-B transaction; this benchmark simulates a banking system. It
performs inserts, updates, and selects on customer accounts.
In the first test, 1000 transactions were submitted to the database within 2
seconds. The time interval for analysis is divided into seconds. The
distribution of transactions per interval is uneven due to client boot
delay. In the first interval, 200 transactions were submitted, and all were
served and completed successfully.
In the subsequent interval, 800 transactions were submitted, and of these, 33
were not started due to connection refusal by the DB.
This rate is unexpected, since the DB was configured to serve 2,000
concurrent connections.
An important fact that deserves emphasis is that there were no aborted
transactions; i.e., all transactions submitted and accepted by the DB
completed successfully, with an average execution time of 1 second or less.
The worst case happens with 100000 transactions. In this case the success
rate is near 800 per second, but the "connection errors" reach 10000 per
second.
The configured max_connections limit was not reached at any time.
Is this a bug?
Do I make myself clear?
Best regards
Jorge Augusto Meira
Jorge Augusto Meira wrote:
Is this a bug?
You don't provide sufficient information. PostgreSQL version? Error message?
Regarding your problem, are you sure you're not reaching max_connections?
Did you check the logs?
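One way to check is to count refusals on the server side rather than trusting the client's view. A Python sketch (the log line text matches the 8.4 error message quoted later in the thread; the file path is a hypothetical example):

```python
def count_limit_errors(lines):
    """Count server log lines reporting the max_connections limit."""
    needle = "connection limit exceeded for non-superusers"
    return sum(1 for line in lines if needle in line)

# Hypothetical usage against a server log file:
# with open("/var/log/postgresql/postgresql-8.4-main.log") as f:
#     print(count_limit_errors(f))
```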
--
Euler Taveira de Oliveira
http://www.timbira.com/
Euler Taveira de Oliveira <euler@timbira.com> wrote:
Talking about your problem, are you sure you're not reaching
max_connections?
It also strikes me that from the minimal information given, it might
be possible that pid numbers or port numbers are wrapping around
before the OS is ready to allow re-use. I haven't seen that
behavior except in machines infected with a worm, but this test
might be edging into the same territory if it's using a new
connection for each request. Obviously, nobody who cared about
performance would use that technique in production, but it rather
sounds like this test does.
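A back-of-the-envelope check of this wraparound hypothesis, using common Linux defaults (assumed values, not measurements from this thread): the default ephemeral port range is 32768-61000, and a closed socket lingers in TIME_WAIT for roughly 60 seconds.

```python
# Ephemeral ports available per client machine under the default range
ephemeral_ports = 61000 - 32768 + 1
# Approximate time (seconds) a closed port stays unusable in TIME_WAIT
time_wait_s = 60
# Sustainable rate of *new* connections per second before ports wrap around
sustainable_rate = ephemeral_ports / time_wait_s

print(ephemeral_ports, round(sustainable_rate))  # 28233 ports, ~471 conn/s
```

At ~471 new connections per second per client, five clients sustain roughly 2,350 connections per second, which is far below the reported 10000 attempts per second, so client-side port exhaustion is at least plausible for a connection-per-request test.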
-Kevin
Hi
Thanks for the answers.
You're right, I didn't give you the basic information.
I used PostgreSQL 8.4. The server configuration was:
Processor: Intel Xeon W3570 (quad core)
Mem: 20GB
Network Interface: Gigabit
HD: 12 x 1 TB (RAID1+0)
OS: Debian GNU/Linux, kernel 2.6.26-2-64
The client machines have a similar configuration.
The error message was:
"Erro Conexão: A tentativa de conexão falhou." ("Connection error: the connection attempt failed.")
or
"Erro Conexão: FATAL: connection limit exceeded for non-superusers"
I used the logs of my Java application, which simulates the clients.
Thanks
Jorge
On Fri, Dec 3, 2010 at 12:54 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
Jorge Augusto Meira <jmeira@c3sl.ufpr.br> writes:
The error message was:
"Erro Conexão: A tentativa de conexão falhou."
or
"Erro Conexão: FATAL: connection limit exceeded for non-superusers"
Hmm ... I can't find the first of those anywhere in the 8.4 message
lists; but the second one definitely says that you *are* hitting the
max_connections limit, whether you think you should be or not.
I wonder whether you are neglecting to allow for the fact that backends
have a nonzero shutdown time? If you disconnect and immediately
reconnect, it's possible that your old backend is still around, so that
the new connection attempt causes max_connections to be exceeded. This
is particularly likely if the test program is on the same machine as the
database server, because the test program itself is likely to have a
higher scheduling priority than the old backend.
regards, tom lane
Hi Tom
The test program is running on 5 other client machines.
According to the logs of my test program, the max_connections limit is never
reached.
Regards
Jorge
On Mon, Dec 6, 2010 at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Jorge Augusto Meira wrote:
The test program is running in other 5 client machines.
In the logs of my test program, the max_connection parameter limit is
never reached.
How could the test program know? Indeed, it doesn't. Are you using some delay
between one test and the next? It would be a good idea, especially if you're
restarting PostgreSQL between tests.
--
Euler Taveira de Oliveira
http://www.timbira.com/
Hi, Euler
- How could the test program know?
After each operation (insert, update, or select), the test program receives a
message from Postgres: either OK or ERROR (Connection error: FATAL).
- Are you using some delay between one test and the next? It would be a
good idea, especially if you're restarting PostgreSQL between tests.
Yes, I used a delay. After each test I restarted Postgres and cleared my
database.
Regards
Jorge
On Mon, Dec 6, 2010 at 4:26 PM, Euler Taveira de Oliveira <euler@timbira.com> wrote:
Hi again
Is there anything else I can do to reach the limit of the max_connections
parameter?
Could this be a bug?
Thanks
Jorge
On Mon, Dec 13, 2010 at 8:46 PM, Jorge Augusto Meira
<jmeira@c3sl.ufpr.br> wrote:
Is there anything else I can do to reach the limit of the max_connections
parameter? Could this be a bug?
Well, I don't think you've really identified what's happening. Kevin
and Tom both suggested possible explanations upthread.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company