Determining server load

Started by Israel Brewster over 9 years ago. 16 messages. List: general.
#1Israel Brewster
israel@ravnalaska.net

I have a Postgresql (9.4.6) cluster that hosts several databases, used by about half-a-dozen different in-house apps. I have two servers set up as master-slave with streaming replication. Lately I've been running into an issue where one of the apps periodically can't connect to the db. The problem is always extremely short lived (less than a minute), such that by the time I can look into it, there is no issue. My *suspicion* is that I am briefly hitting the max connection limit of my server (currently set at 100). If so, I can certainly *fix* the issue easily by increasing the connection limit, but I have two questions about this:

1) Is there some way I can track concurrent connections to see if my theory is correct? I know I can do a count(*) on pg_stat_activity to get the current number of connections at any point (currently at 45 BTW), but aside from repeatedly querying this at short intervals, which I am afraid would put undue load on the server by the time it is frequent enough to be of use, I don't know how to track concurrent connections.

I did look at pgbadger, which tells me I have gotten as high as 62 connections/second, but given that most of those connections are probably very short lived that doesn't really tell me anything about concurrent connections.
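A sampling loop along these lines could answer the peak-concurrency question without undue load (a sketch; the interval and sample counts are arbitrary, and `sample_fn` stands in for whatever runs `SELECT count(*) FROM pg_stat_activity`, e.g. via psycopg2):

```python
import time

def track_peak(sample_fn, interval_s=5, samples=720):
    """Poll a connection-count source and keep the running peak.

    sample_fn is a stand-in for a real query such as
    SELECT count(*) FROM pg_stat_activity (e.g. via psycopg2).
    With interval_s=5 and samples=720 this watches one hour.
    """
    peak = 0
    history = []
    for _ in range(samples):
        n = sample_fn()
        history.append(n)
        peak = max(peak, n)
        time.sleep(interval_s)
    return peak, history
```

A five-second interval on a single count(*) is far cheaper than the per-connection fork cost discussed below, so the sampling itself should not add noticeable load.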

2) Is increasing the connection limit even the "proper" fix for this, or am I at a load point where I need to start looking at tools like pgpool or something to distribute some of the load to my hot standby server? I do realize you may not be able to answer that directly, since I haven't given enough information about my server/hardware/load, etc, but answers that tell me how to better look at the load over time and figure out if I am overloaded are appreciated.

For reference, the server is running on the following hardware:

2x 8-core Xeon E5-2630 v3 2.4 GHz processor (so plenty of horsepower there)
32 GB Ram total, currently with 533144k showing as "free" and 370464k of swap used
371 GB SSD RAID 10 (currently only using 40GB of space)
Dual Gigabit ethernet

Thanks for any advice that can be provided!
-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------

#2John R Pierce
pierce@hogranch.com
In reply to: Israel Brewster (#1)
Re: Determining server load

On 9/27/2016 9:54 AM, Israel Brewster wrote:

I did look at pgbadger, which tells me I have gotten as high as 62
connections/second, but given that most of those connections are
probably very short lived that doesn't really tell me anything about
concurrent connections.

Each connection requires a process fork of the database server, which is
very expensive. You might consider using a connection pool such as
pgbouncer to maintain a fixed (but dynamic) number of real database
connections, and have your apps connect/disconnect to this pool.
Obviously, you need a pool for each database, and your apps need to be
'stateless' and not make or rely on any session changes to the
connection so they don't interfere with each other. Doing this
correctly can make a huge performance improvement on the sort of apps
that do (connect, transaction, disconnect) a lot.

--
john r pierce, recycling bits in santa cruz

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Israel Brewster (#1)
Re: Determining server load

On 09/27/2016 09:54 AM, Israel Brewster wrote:

I have a Postgresql (9.4.6) cluster that hosts several databases, used
by about half-a-dozen different in-house apps. I have two servers set up
as master-slave with streaming replication. Lately I've been running
into an issue where one of the apps periodically can't connect to the
db. The problem is always extremely short lived (less than a minute),
such that by the time I can look into it, there is no issue. My
*suspicion* is that I am briefly hitting the max connection limit of my
server (currently set at 100). If so, I can certainly *fix* the issue
easily by increasing the connection limit, but I have two questions
about this:

What does your Postgres log show around this time?


--
Adrian Klaver
adrian.klaver@aklaver.com


#4Israel Brewster
israel@ravnalaska.net
In reply to: Adrian Klaver (#3)
Re: Determining server load

On Sep 27, 2016, at 10:07 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 09/27/2016 09:54 AM, Israel Brewster wrote:

I have a Postgresql (9.4.6) cluster that hosts several databases, used
by about half-a-dozen different in-house apps. I have two servers set up
as master-slave with streaming replication. Lately I've been running
into an issue where one of the apps periodically can't connect to the
db. The problem is always extremely short lived (less than a minute),
such that by the time I can look into it, there is no issue. My
*suspicion* is that I am briefly hitting the max connection limit of my
server (currently set at 100). If so, I can certainly *fix* the issue
easily by increasing the connection limit, but I have two questions
about this:

What does your Postgres log show around this time?

So in looking further, I realized the actual error I was getting was "no route to host", which is obviously a networking issue and not a postgres issue - could not connect was only the end result. The logs then, of course, show normal operation. That said, now that I am thinking about it, I'm still curious as to how I can track concurrent connections, with the revised goal of simply seeing how heavily loaded my server really is, and when tools such as pgpool or the pgbouncer that another user mentioned start making sense for the number of connections I am dealing with. Thanks.
-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------


#5Melvin Davidson
melvin6925@gmail.com
In reply to: Israel Brewster (#4)
Re: Determining server load


I'm still curious as to how I can track concurrent connections, ...

Have you considered enabling the following in postgresql.conf?
log_connections=on
log_disconnections=on

It will put a bit of bloat in your postgres log, but it will allow you
to extract connects/disconnects over a time range. That should allow you
to determine concurrent connections during that time.

*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#6Israel Brewster
israel@ravnalaska.net
In reply to: John R Pierce (#2)
Re: Determining server load

On Sep 27, 2016, at 9:55 AM, John R Pierce <pierce@hogranch.com> wrote:

On 9/27/2016 9:54 AM, Israel Brewster wrote:

I did look at pgbadger, which tells me I have gotten as high as 62 connections/second, but given that most of those connections are probably very short lived that doesn't really tell me anything about concurrent connections.

Each connection requires a process fork of the database server, which is very expensive. you might consider using a connection pool such as pgbouncer, to maintain a fixed(dynamic) number of real database connections, and have your apps connect/disconnect to this pool. Obviously, you need a pool for each database, and your apps need to be 'stateless' and not make or rely on any session changes to the connection so they don't interfere with each other. Doing this correctly can make an huge performance improvement on the sort of apps that do (connect, transaction, disconnect) a lot.

Understood. My main *performance critical* apps all use an internal connection pool for this reason - Python's psycopg2 pool, to be exact. I still see a lot of connects/disconnects, but I *think* that's psycopg2 recycling connections in the background - I'm not 100% certain how the pools there work (and maybe they need some tweaking as well, i.e. setting to re-use connections more times or something). The apps that don't use pools are typically data-gathering scripts where it doesn't matter how long it takes to connect/write the data (within reason).

That said, it seems highly probable, if not a given, that there comes a point where the overhead of handling all those connections starts slowing things down, and not just for the new connection being made. How to figure out where that point is for my system, and how close to it I am at the moment, is a large part of what I am wondering.

Note also that I did realize I was completely wrong about the initial issue - it turned out it was a network issue, not a postgresql one. Still, I think my specific questions still apply, if only in an academic sense now :-)

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------

--
john r pierce, recycling bits in santa cruz


#7Israel Brewster
israel@ravnalaska.net
In reply to: Melvin Davidson (#5)
Re: Determining server load

I'm still curious as to how I can track concurrent connections, ...

Have you considered enabling the following in postgresql.conf?
log_connections=on
log_disconnections=on

It will put a bit of bloat in your postgres log, but it will allow you to extract connects/disconnects over a time range. That should allow you
to determine concurrent connections during that time.

I do have those on, and I could write a parser that scans through the logs counting connections and disconnections to give a number of current connections at any given time. Trying to make it operate "in real time" would be interesting, though, as PG logs into different files by day-of-the-week (at least, with the settings I have), rather than into a single file that gets rotated out. I was kind of hoping such a tool, such as pgbadger (which, unfortunately, only seems to track connections per second and not concurrent connections), already existed, or that there was some way to have the database itself track this metric. If not, well, I guess that's another project :)
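Such a parser does not need much; a minimal sketch (assuming log_connections/log_disconnections are on, the stock English log wording, and a scan that starts from a point with no open sessions):

```python
def peak_concurrency(lines):
    """Return the peak number of simultaneous sessions seen in a
    stream of postgres log lines written with log_connections and
    log_disconnections enabled.
    """
    current = peak = 0
    for line in lines:
        if "connection authorized" in line:
            current += 1
            peak = max(peak, current)
        elif "disconnection:" in line:
            # never go negative if the scan starts mid-session
            current = max(current - 1, 0)
    return peak
```

Feeding it the per-day files in order sidesteps the rotation issue, at the cost of not being real time.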

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------


#8Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Israel Brewster (#6)
Re: Determining server load

On 09/27/2016 11:40 AM, Israel Brewster wrote:

On Sep 27, 2016, at 9:55 AM, John R Pierce <pierce@hogranch.com> wrote:

On 9/27/2016 9:54 AM, Israel Brewster wrote:

I did look at pgbadger, which tells me I have gotten as high as 62 connections/second, but given that most of those connections are probably very short lived that doesn't really tell me anything about concurrent connections.

Each connection requires a process fork of the database server, which is very expensive. you might consider using a connection pool such as pgbouncer, to maintain a fixed(dynamic) number of real database connections, and have your apps connect/disconnect to this pool. Obviously, you need a pool for each database, and your apps need to be 'stateless' and not make or rely on any session changes to the connection so they don't interfere with each other. Doing this correctly can make an huge performance improvement on the sort of apps that do (connect, transaction, disconnect) a lot.

Understood. My main *performance critical* apps all use an internal connection pool for this reason - Python's psycopg2 pool, to be exact. I still see a lot of connects/disconnects, but I *think* that's psycopg2 recycling connections in the background - I'm not 100% certain how the pools there work (and maybe they need some tweaking as well, i.e. setting to re-use connections more times or something). The apps that don't use pools are typically data-gathering scripts where it doesn't mater how long it takes to connect/write the data (within reason).

http://initd.org/psycopg/docs/pool.html

"Note

This pool class is mostly designed to interact with Zope and probably
not useful in generic applications. "

Are you using Zope?


--
Adrian Klaver
adrian.klaver@aklaver.com


#9Melvin Davidson
melvin6925@gmail.com
In reply to: Israel Brewster (#7)
Re: Determining server load



Does this help?

-- Total concurrent connections
SELECT COUNT(*)
FROM pg_stat_activity;

-- concurrent connections by user
SELECT usename,
       count(*)
FROM pg_stat_activity
GROUP BY 1
ORDER BY 1;

-- concurrent connections by database
SELECT datname,
       usename,
       count(*)
FROM pg_stat_activity
GROUP BY 1, 2
ORDER BY 1, 2;

-- database connections by user
SELECT usename,
       datname,
       count(*)
FROM pg_stat_activity
GROUP BY 1, 2
ORDER BY 1, 2;
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#10Jonathan Vanasco
postgres@2xlp.com
In reply to: Israel Brewster (#7)
Re: Determining server load

On Sep 27, 2016, at 2:46 PM, Israel Brewster wrote:

I do have those on, and I could write a parser that scans through the logs counting connections and disconnections to give a number of current connections at any given time. Trying to make it operate "in real time" would be interesting, though, as PG logs into different files by day-of-the-week (at least, with the settings I have), rather than into a single file that gets rotated out. I was kind of hoping such a tool, such as pgbadger (which, unfortunately, only seems to track connections per second and not consecutive connections), already existed, or that there was some way to have the database itself track this metric. If not, well, I guess that's another project :)

There are a lot of postgres configs and server specific tools... but on the application side and for general debugging, have you looked at statsd ? https://github.com/etsy/statsd

It's a lightweight node.js app that runs on your server and listens for UDP signals, which your apps can emit for counting or timing. We have a ton of Python apps logging to it, including every postgres connection open/close and error. The overhead of clients and server is negligible. When combined with the graphite app for browsing data via charts, it becomes really useful at detecting issues with load or errors stemming from a deployment -- you just look for spikes and cliffs. We even use it to log the volume of INSERTS vs SELECTS vs UPDATES being sent to postgres.
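The emitting side is tiny; a sketch of a client for the plain statsd line protocol over UDP (the metric names here are made up for illustration):

```python
import socket

class StatsdClient:
    """Fire-and-forget sender for the statsd line protocol."""

    def __init__(self, host="127.0.0.1", port=8125):
        self.addr = (host, port)
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def incr(self, name, n=1):
        # counter, e.g. "pg.connection.open:1|c"
        self.sock.sendto(f"{name}:{n}|c".encode(), self.addr)

    def timing(self, name, ms):
        # timer in milliseconds, e.g. "pg.query:12|ms"
        self.sock.sendto(f"{name}:{ms}|ms".encode(), self.addr)
```

Because the datagrams are fire-and-forget, a down or slow statsd server never blocks the app doing the emitting.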

The more services/apps you run, the more useful it gets, as you can figure out which apps/deployments are screwing up postgres and the exact moment things went wrong.

#11Israel Brewster
israel@ravnalaska.net
In reply to: Adrian Klaver (#8)
Re: Determining server load

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------

#12Israel Brewster
israel@ravnalaska.net
In reply to: Jonathan Vanasco (#10)
Re: Determining server load

On Sep 27, 2016, at 10:55 AM, Jonathan Vanasco <postgres@2xlp.com> wrote:

On Sep 27, 2016, at 2:46 PM, Israel Brewster wrote:

I do have those on, and I could write a parser that scans through the logs counting connections and disconnections to give a number of current connections at any given time. Trying to make it operate "in real time" would be interesting, though, as PG logs into different files by day-of-the-week (at least, with the settings I have), rather than into a single file that gets rotated out. I was kind of hoping such a tool, such as pgbadger (which, unfortunately, only seems to track connections per second and not consecutive connections), already existed, or that there was some way to have the database itself track this metric. If not, well, I guess that's another project :)

There are a lot of postgres configs and server specific tools... but on the application side and for general debugging, have you looked at statsd? https://github.com/etsy/statsd

it's a lightweight node.js app that runs on your server and listens for UDP signals, which your apps can emit for counting or timing. We have a ton of Python apps logging to it, including every postgres connection open/close and error. The overhead of clients and server is negligible. When combined with the graphite app for browsing data via charts, it becomes really useful at detecting issues with load or errors stemming from a deployment -- you just look for spikes and cliffs. We even use it to log the volume of INSERTS vs SELECTS vs UPDATES being sent to postgres.

The more services/apps you run, the more useful it gets, as you can figure out which apps/deployments are screwing up postgres and the exact moment things went wrong.

That sounds quite promising. I'll look into it. Thanks!

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------

#13Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Israel Brewster (#11)
Re: Determining server load

On 09/27/2016 12:01 PM, Israel Brewster wrote:


On Sep 27, 2016, at 10:48 AM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 09/27/2016 11:40 AM, Israel Brewster wrote:

On Sep 27, 2016, at 9:55 AM, John R Pierce <pierce@hogranch.com> wrote:

On 9/27/2016 9:54 AM, Israel Brewster wrote:

I did look at pgbadger, which tells me I have gotten as high as 62 connections/second, but given that most of those connections are probably very short lived that doesn't really tell me anything about concurrent connections.

Each connection requires a process fork of the database server, which is very expensive. you might consider using a connection pool such as pgbouncer, to maintain a fixed(dynamic) number of real database connections, and have your apps connect/disconnect to this pool. Obviously, you need a pool for each database, and your apps need to be 'stateless' and not make or rely on any session changes to the connection so they don't interfere with each other. Doing this correctly can make an huge performance improvement on the sort of apps that do (connect, transaction, disconnect) a lot.

Understood. My main *performance critical* apps all use an internal connection pool for this reason - Python's psycopg2 pool, to be exact. I still see a lot of connects/disconnects, but I *think* that's psycopg2 recycling connections in the background - I'm not 100% certain how the pools there work (and maybe they need some tweaking as well, i.e. setting to re-use connections more times or something). The apps that don't use pools are typically data-gathering scripts where it doesn't mater how long it takes to connect/write the data (within reason).

http://initd.org/psycopg/docs/pool.html

"Note

This pool class is mostly designed to interact with Zope and probably not useful in generic applications. "

Are you using Zope?

You'll notice that note only applies to the PersistentConnectionPool, not the ThreadedConnectionPool (Which has a note saying that it can be safely used in multi-threaded applications), or the SimpleConnectionPool (which is useful only for single-threaded applications). Since I'm not using Zope, and do have multi-threaded applications, I'm naturally using the ThreadedConnectionPool :-)

Oops, did not catch that.


--
Adrian Klaver
adrian.klaver@aklaver.com


#14Israel Brewster
israel@ravnalaska.net
In reply to: Melvin Davidson (#9)
Re: Determining server load

On Sep 27, 2016, at 10:55 AM, Melvin Davidson <melvin6925@gmail.com> wrote:

On Tue, Sep 27, 2016 at 2:46 PM, Israel Brewster <israel@ravnalaska.net <mailto:israel@ravnalaska.net>> wrote:

I'm still curious as to how I can track concurrent connections, ...

Have you considered enabling the following in postgresql.conf?
log_connections=on
log_disconnections=on

It will put a bit of bloat in your postgres log, but it will allow you to extract connects/disconnects over a time range. That should allow you
to determine concurrent connections during that period.

I do have those on, and I could write a parser that scans through the logs counting connections and disconnections to give a number of current connections at any given time. Trying to make it operate "in real time" would be interesting, though, as PG logs into different files by day-of-the-week (at least, with the settings I have), rather than into a single file that gets rotated out. I was kind of hoping such a tool already existed (pgbadger, unfortunately, only seems to track connections per second, not concurrent connections), or that there was some way to have the database itself track this metric. If not, well, I guess that's another project :)
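The counting side of such a parser is small. A minimal sketch (assuming the stock "connection authorized" / "disconnection:" messages that log_connections/log_disconnections emit; tailing the right per-day file and handling log rotation is the part left out):

```python
# Replay log_connections/log_disconnections entries and track both the
# running number of open connections and the peak seen so far.
def track_connections(lines):
    current = 0
    peak = 0
    for line in lines:
        if "connection authorized" in line:
            current += 1
            peak = max(peak, current)
        elif "disconnection:" in line:
            # Clamp at zero in case the log starts mid-stream and we see
            # disconnects for sessions opened before our first line.
            current = max(current - 1, 0)
    return current, peak

if __name__ == "__main__":
    sample = [
        "LOG:  connection authorized: user=app database=flights",
        "LOG:  connection authorized: user=app database=flights",
        "LOG:  disconnection: session time: 0:00:01.234 user=app database=flights",
        "LOG:  connection authorized: user=app database=flights",
    ]
    print(track_connections(sample))  # -> (2, 2)
```

Comparing the peak against max_connections after the fact would show how close the server actually got, without polling pg_stat_activity.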

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293 <tel:%28907%29%20450-7293>
-----------------------------------------------

Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Does this help?

--Total concurrent connections
SELECT COUNT(*)
FROM pg_stat_activity;

--concurrent connections by user
SELECT usename,
count(*)
FROM pg_stat_activity
GROUP BY 1
ORDER BY 1;

--concurrent connections by database
SELECT datname,
count(*)
FROM pg_stat_activity
GROUP BY 1
ORDER BY 1;

-- database connections by user
SELECT usename,
datname,
count(*)
FROM pg_stat_activity
GROUP BY 1, 2
ORDER BY 1, 2;

--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

That helps for one-time stat collection, but as I mentioned in my original message, since connections may not last long, I could be getting close to, or even hitting, my connection limit while still getting values back from queries that show plenty of connections remaining, depending on how often I checked.

I guess what would be ideal in my mind is that whenever Postgresql logged an opened/closed connection, it also logged the *total* number of open connections at that time. I don't think that's possible, however :-)

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------

#15John R Pierce
pierce@hogranch.com
In reply to: Israel Brewster (#14)
Re: Determining server load

On 9/27/2016 12:06 PM, Israel Brewster wrote:

That helps for one-time stat collection, but as I mentioned in my
original message, since connections may not last long, I could be
getting close to, or even hitting, my connection limit while still
getting values back from those that show plenty of connections
remaining, depending on how often I checked.

I guess what would be ideal in my mind is that whenever Postgresql
logged an opened/closed connection, it also looked the *total* number
of open connections at that time. I don't think that's possible,
however :-)

if you stick pgbouncer in front of postgres (with a pool for each
user@database), I believe you CAN track the max connections via
pgbouncer's pool stats.

--
john r pierce, recycling bits in santa cruz


#16Israel Brewster
israel@ravnalaska.net
In reply to: John R Pierce (#15)
Re: Determining server load

On Sep 27, 2016, at 11:16 AM, John R Pierce <pierce@hogranch.com> wrote:

On 9/27/2016 12:06 PM, Israel Brewster wrote:

That helps for one-time stat collection, but as I mentioned in my original message, since connections may not last long, I could be getting close to, or even hitting, my connection limit while still getting values back from those that show plenty of connections remaining, depending on how often I checked.

I guess what would be ideal in my mind is that whenever Postgresql logged an opened/closed connection, it also looked the *total* number of open connections at that time. I don't think that's possible, however :-)

if you stick pgbouncer in front of postgres (with a pool for each user@database), I believe you CAN track the max connections via pgbouncer's pool stats.

Ahh! If so, that alone would be reason enough for using pgbouncer. Thanks!

-----------------------------------------------
Israel Brewster
Systems Analyst II
Ravn Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7293
-----------------------------------------------
