PG version recommendation

Started by David Gauthieralmost 7 years ago18 messagesgeneral
Jump to latest
#1David Gauthier
davegauthierpg@gmail.com

Hi:

I'm going to be requesting a PG instance supported by an IT team in a large
corp. They will be creating the server as a VM. We will be loading the DB
using scripts (perl/dbi) on linux, possibly using bulk loading techniques
if that's required. Queries will come from both linux and the web, but
typically the number of concurrent users will be on the order of 10 reads,
maybe a couple writers. < 1T total disk, no partitioning. I will be
requesting PITR.

I need to pick a PG version in my request. I want something that will be
stable and reliable while, of course, being able to perform well. What
would be a good choice for PG version ?

Also, since the server will be a VM, are there any special
recommendations/suggestions might I forward in the request (install
options, tuning options, other) ?

Thanks !

#2Adrian Klaver
adrian.klaver@aklaver.com
In reply to: David Gauthier (#1)
Re: PG version recommendation

On 5/7/19 11:52 AM, David Gauthier wrote:

Hi:

I'm going to be requesting a PG instance supported by an IT team in a
large corp.  They will be creating the server as a VM.  We will be
loading the DB using scripts (perl/dbi) on linux, possibly using bulk
loading techniques if that's required.  Queries will come from both
linux and the web, but typically the number of concurrent users will be
on the order of 10 reads, maybe a couple writers.  < 1T total disk, no
partitioning.  I will be requesting PITR.

I need to pick a PG version in my request.  I want something that will
be stable and reliable while, of course, being able to perform well.
What would be a good choice for PG version ?

The latest production release v 11. You will have community support
until November 2023:

https://www.postgresql.org/support/versioning/

As will all suggestions test under your conditions to verify.

Also, since the server will be a VM, are there any special
recommendations/suggestions might I forward in the request (install
options, tuning options, other) ?

Thanks !

--
Adrian Klaver
adrian.klaver@aklaver.com

#3Thomas Kellerer
spam_eater@gmx.net
In reply to: David Gauthier (#1)
Re: PG version recommendation

David Gauthier schrieb am 07.05.2019 um 20:52:

I need to pick a PG version in my request. I want something that
will be stable and reliable while, of course, being able to perform
well. What would be a good choice for PG version?

There is no reason to not choose the latest version.
So I would suggest to pick Postgres 11

#4Ron
ronljohnsonjr@gmail.com
In reply to: David Gauthier (#1)
Re: PG version recommendation

On 5/7/19 1:52 PM, David Gauthier wrote:

Hi:

I'm going to be requesting a PG instance supported by an IT team in a
large corp.  They will be creating the server as a VM.  We will be loading
the DB using scripts (perl/dbi) on linux, possibly using bulk loading
techniques if that's required.  Queries will come from both linux and the
web, but typically the number of concurrent users will be on the order of
10 reads, maybe a couple writers.  < 1T total disk, no partitioning. I
will be requesting PITR.

I need to pick a PG version in my request.  I want something that will be
stable and reliable while, of course, being able to perform well. What
would be a good choice for PG version ?

Also, since the server will be a VM, are there any special
recommendations/suggestions might I forward in the request (install
options, tuning options, other) ?

Home-rolled application, or third party?  If third party, they'll have
something to say about supported version numbers.

--
Angular momentum makes the world go 'round.

#5David Gauthier
davegauthierpg@gmail.com
In reply to: Ron (#4)
Re: PG version recommendation

Home-rolled application, or third party?

Are you asking about how they do VMs ?
They already provide PG v9.6.7 , so I gather they're not averse to
supporting PG DBs.

On Tue, May 7, 2019 at 4:05 PM Ron <ronljohnsonjr@gmail.com> wrote:

Show quoted text

On 5/7/19 1:52 PM, David Gauthier wrote:

Hi:

I'm going to be requesting a PG instance supported by an IT team in a
large corp. They will be creating the server as a VM. We will be loading
the DB using scripts (perl/dbi) on linux, possibly using bulk loading
techniques if that's required. Queries will come from both linux and the
web, but typically the number of concurrent users will be on the order of
10 reads, maybe a couple writers. < 1T total disk, no partitioning. I
will be requesting PITR.

I need to pick a PG version in my request. I want something that will be
stable and reliable while, of course, being able to perform well. What
would be a good choice for PG version ?

Also, since the server will be a VM, are there any special
recommendations/suggestions might I forward in the request (install
options, tuning options, other) ?

Home-rolled application, or third party? If third party, they'll have
something to say about supported version numbers.

--
Angular momentum makes the world go 'round.

#6Tim Cross
theophilusx@gmail.com
In reply to: David Gauthier (#1)
Re: PG version recommendation

I would find out if the IT team who will maintain the system are running
a specific Linux distribution, such as RHEL and just go with the PG
version that is on that distribution.

Large corp rarely have sufficient IT resources. Unless you specifically
need a particular PG version (which does not seem to be the case based
on your info), you are better off sticking with the version provided by
whatever distro they use. This will ensure reasonable updates and
patches. In corp environments, where IT resources are thin on the
ground, any custom install often results in poor patching and update
cycles because it falls outside 'standard procedures'.

With respect to hardware specifications, it really depends a lot on what
the infrastructure is. Typically, you would be better off specifying the
performance and storage (size) you require and leave it to the IT team
to work out how to best satisfy those requirements e.g. support x
concurrent connections, y Tb/Gb of storage, backup requirements i.e. how
frequent, how many versions and retention requirements. Include details
of any additional PG packages you may need/want and how/where the
database will need to be accessed from.

As you indicate the host will be a VM, it should be relatively easy to
scale up/down cpus or memory as required, unless you have special
requirements (very complex queries, very large data sets, complex data
models involving GIS, XML, etc that may exceed resources available in
their VM infrastructure).

From your description, your database sounds fairly standard with no
unusual requirements. The number of concurrent users is low and it
sounds like it may be a new application where you probably don't yet
know where performance or resource bottlenecks will be. A standard Linux
server with 16+Gb memory and a couple of Gb for storage running PG 9.6
or higher is likely to be a reasonable starting point.

It would also be a good idea to speak to the IT team and see if they
have any procedures/policies for requesting resources. Ask them what
info they need to know and then gather that. It is unlikely to help if
yuou specify hardware requirements they cannot easily support,
especially if those requirements are really just arbitrary and based on
external recommendations from people who don't know what the
infrastructure is. Nothing frustrates IT teams more than being require
to provide over specified systems which consume valuable resources that
are never used or demand custom configurations which are unnecessary and
just add to their maintenance overheads.

Tim

David Gauthier <davegauthierpg@gmail.com> writes:

Hi:

I'm going to be requesting a PG instance supported by an IT team in a large
corp. They will be creating the server as a VM. We will be loading the DB
using scripts (perl/dbi) on linux, possibly using bulk loading techniques
if that's required. Queries will come from both linux and the web, but
typically the number of concurrent users will be on the order of 10 reads,
maybe a couple writers. < 1T total disk, no partitioning. I will be
requesting PITR.

I need to pick a PG version in my request. I want something that will be
stable and reliable while, of course, being able to perform well. What
would be a good choice for PG version ?

Also, since the server will be a VM, are there any special
recommendations/suggestions might I forward in the request (install
options, tuning options, other) ?

Thanks !

--
Tim Cross

#7Ron
ronljohnsonjr@gmail.com
In reply to: David Gauthier (#5)
Re: PG version recommendation

No, I'm asking about the application.

On 5/7/19 3:51 PM, David Gauthier wrote:

Home-rolled application, or third party?

Are you asking about how they do VMs ?
They already provide PG v9.6.7 , so I gather they're not averse to
supporting PG DBs.

On Tue, May 7, 2019 at 4:05 PM Ron <ronljohnsonjr@gmail.com
<mailto:ronljohnsonjr@gmail.com>> wrote:

On 5/7/19 1:52 PM, David Gauthier wrote:

Hi:

I'm going to be requesting a PG instance supported by an IT team in a
large corp.  They will be creating the server as a VM.  We will be
loading the DB using scripts (perl/dbi) on linux, possibly using bulk
loading techniques if that's required.  Queries will come from both
linux and the web, but typically the number of concurrent users will
be on the order of 10 reads, maybe a couple writers.  < 1T total
disk, no partitioning.  I will be requesting PITR.

I need to pick a PG version in my request.  I want something that
will be stable and reliable while, of course, being able to perform
well.  What would be a good choice for PG version ?

Also, since the server will be a VM, are there any special
recommendations/suggestions might I forward in the request (install
options, tuning options, other) ?

Home-rolled application, or third party?  If third party, they'll have
something to say about supported version numbers.

--
Angular momentum makes the world go 'round.

--
Angular momentum makes the world go 'round.

#8Ian Lawrence Barwick
barwick@gmail.com
In reply to: Tim Cross (#6)
Re: PG version recommendation

On Wed, 8 May 2019 at 07:19, Tim Cross <theophilusx@gmail.com> wrote:

I would find out if the IT team who will maintain the system are running
a specific Linux distribution, such as RHEL and just go with the PG
version that is on that distribution.

Bear in mind, depending on the OS, the default version available may be
chronically outdated and no longer supported by the community. For example
the CentOS 7 out-of-the-box PostgreSQL version is 9.2.24.

Regards

Ian Barwick

--
Ian Barwick https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#9Julie Nishimura
juliezain@hotmail.com
In reply to: Ian Lawrence Barwick (#8)
postgresql 9.4 restart

Hello,
Our staging 9.4 postgresql has been running couple of weeks with no problem, but yesterday we saw this error from one of our services that connects to the staging DBs:

FATAL: terminating connection due to
administrator command; nested exception is
org.postgresql.util.PSQLException: FATAL: terminating connection due to
administrator command”,

Then I could see that postgresql got restarted, based on the output of "select pg_postmaster_start_time()"

We are running postgresql on
Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

How I can find out what cause the restart? Any ideas are welcome. Thanks!

-Julie N

#10Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Julie Nishimura (#9)
Re: postgresql 9.4 restart

On 5/8/19 11:00 AM, Julie Nishimura wrote:

Hello,
Our staging 9.4 postgresql has been running couple of weeks with no
problem, but yesterday we saw this error from one of our services that
connects to the staging DBs:

FATAL: terminating connection due to
administrator command; nested exception is
org.postgresql.util.PSQLException: FATAL: terminating connection due to
administrator command�,

Distro and version?

Well it originated in Java code if that helps?

Anything in the Postgres logs just prior to 'FATAL: terminating
connection ...' that might be relevant?

Do the system logs showing anything relevant e.g. the OOM killer
stopping Postgres?

Then I could see that postgresql got restarted, based on the output of
"select pg_postmaster_start_time()"

We are running postgresql on
Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64
x86_64 x86_64 GNU/Linux

How I can find out what cause the restart? Any ideas are welcome. Thanks!

-Julie N

--
Adrian Klaver
adrian.klaver@aklaver.com

#11Julie Nishimura
juliezain@hotmail.com
In reply to: Adrian Klaver (#10)
Re: postgresql 9.4 restart

Adrian, thanks for your reply!

PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1), compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit

This is what Ive found in logs prior to the restart:
2019-05-07 00:08:26.771 UTC,,,8677,"10.200.193.58:49428",5cd0cc7a.21e5,1,"",2019-05-07 00:08:26 UTC,,0,LOG,00000,"connection received: host=10.200.193.58 port=49428",,,,,,,,,""
2019-05-07 00:08:26.772 UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,2,"authentication",2019-05-07 00:08:26 UTC,448/970,0,LOG,00000,"connection authorized: user=hitwise database=hitwise_us_stg",,,,,,,,,""
2019-05-07 00:08:26.772 UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07 00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are reserved for non-replication superuser connections",,,,,,,,,""
2019-05-07 00:08:26.778 UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021 user=hitwise database=hitwise_us_stg host=10.200.193.58 port=49424",,,,,,,,,""
2019-05-07 00:08:26.779 UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018 user=hitwise database=hitwise_us_stg host=10.200.193.58 port=49426",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17 UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17 UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07 00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07 00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07 00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07 00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07 00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07 00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07 00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339 user=hitwise database=hitwise_uk_stg host=10.200.193.58 port=49246",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07 00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215 user=hitwise database=hitwise_au_stg host=10.200.250.49 port=31100",,,,,,,,,""

Does it help?

________________________________
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Wednesday, May 8, 2019 12:07 PM
To: Julie Nishimura
Cc: pgsql-general@lists.postgresql.org; pgsql-general
Subject: Re: postgresql 9.4 restart

On 5/8/19 11:00 AM, Julie Nishimura wrote:

Hello,
Our staging 9.4 postgresql has been running couple of weeks with no
problem, but yesterday we saw this error from one of our services that
connects to the staging DBs:

FATAL: terminating connection due to
administrator command; nested exception is
org.postgresql.util.PSQLException: FATAL: terminating connection due to
administrator command”,

Distro and version?

Well it originated in Java code if that helps?

Anything in the Postgres logs just prior to 'FATAL: terminating
connection ...' that might be relevant?

Do the system logs showing anything relevant e.g. the OOM killer
stopping Postgres?

Then I could see that postgresql got restarted, based on the output of
"select pg_postmaster_start_time()"

We are running postgresql on
Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64
x86_64 x86_64 GNU/Linux

How I can find out what cause the restart? Any ideas are welcome. Thanks!

-Julie N

--
Adrian Klaver
adrian.klaver@aklaver.com

#12Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Julie Nishimura (#11)
Re: postgresql 9.4 restart

On 5/8/19 2:01 PM, Julie Nishimura wrote:

Adrian, thanks for your reply!

PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1),
compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit

This is what Ive found in logs prior to the restart:

UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07
00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are
reserved for non-replication superuser connections",,,,,,,,,""
2019-05-07 00:08:26.778

The above is not good.

What do you have set for max_connections?

How many connections are you trying against the database?

Do you have replication set up to other Postgres instances(uk, au, us)?

What does the system log show from the same time period?

UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07
00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021
user=hitwise database=hitwise_us_stg host=10.200.193.58
port=49424",,,,,,,,,""
2019-05-07 00:08:26.779
UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07
00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018
user=hitwise database=hitwise_us_stg host=10.200.193.58
port=49426",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17
UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17
UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07
00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07
00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07
00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07
00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07
00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07
00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07
00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339
user=hitwise database=hitwise_uk_stg host=10.200.193.58
port=49246",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07
00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215
user=hitwise database=hitwise_au_stg host=10.200.250.49
port=31100",,,,,,,,,""

Does it help?

------------------------------------------------------------------------
*From:* Adrian Klaver <adrian.klaver@aklaver.com>
*Sent:* Wednesday, May 8, 2019 12:07 PM
*To:* Julie Nishimura
*Cc:* pgsql-general@lists.postgresql.org; pgsql-general
*Subject:* Re: postgresql 9.4 restart
On 5/8/19 11:00 AM, Julie Nishimura wrote:

Hello,
Our staging 9.4 postgresql has been running couple of weeks with no
problem, but yesterday we saw this error from one of our services that
connects to the staging DBs:

FATAL: terminating connection due to
administrator command; nested exception is
org.postgresql.util.PSQLException: FATAL: terminating connection due to
administrator command�,

Distro and version?

Well it originated in Java code if that helps?

Anything in the Postgres logs just prior to 'FATAL: terminating
connection ...' that might be relevant?

Do the system logs showing anything relevant e.g. the OOM killer
stopping Postgres?

Then I could see that postgresql got restarted, based on the output of
"select pg_postmaster_start_time()"

We are running postgresql on
Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64
x86_64 x86_64 GNU/Linux

How I can find out what cause the restart? Any ideas are welcome. Thanks!

-Julie N

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com

#13Julie Nishimura
juliezain@hotmail.com
In reply to: Adrian Klaver (#12)
Re: postgresql 9.4 restart

Thanks again for your help.

1) max_connections is set to 500 in our config file
2) our average conn load is 300-400, however, I am not sure if it was close to max_conn at the moment of restart
3) we do not have replication on this staging server
4) These are the snippets from auth logs with the corresponding time:

/var/message/syslog:

May 7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10290]: (postgres) CMD (/usr/bin/find /data/postgresql/9.4/log -type f ! -name '*.bz2' -mmin +480 -exec bzip2 {} \;)
May 7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10323]: (postgres) CMD (/usr/bin/find /data/postgresql/9.4/log -type f -empty -mmin +60 -delete)
May 7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10379]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May 7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10655]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May 7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10705]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
May 7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10902]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May 7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11401]: (root) LIST (root)
May 7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11403]: (root) LIST (postgres)
May 7 00:28:24 hitw-esg-portal-stage-db-1 puppet-agent[10914]: (/Stage[main]/Ci::Db/Package[postgresql-plpython-]/ensure) ensure changed 'purged' to 'present'
May 7 00:28:40 hitw-esg-portal-stage-db-1 puppet-agent[10914]: Finished catalog run in 106.68 seconds
May 7 00:35:01 hitw-esg-portal-stage-db-1 CRON[14398]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)

this is from auth.log:

May 7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: pam_unix(cron:session): session opened for user postgres by (uid=0)
May 7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: pam_unix(cron:session): session closed for user postgres
May 7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: pam_unix(cron:session): session opened for user postgres by (uid=0)
May 7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: pam_unix(cron:session): session closed for user postgres
May 7 00:02:29 hitw-esg-portal-stage-db-1 sshd[10306]: Connection closed by ::1 port 49936 [preauth]
May 7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: pam_unix(cron:session): session opened for user postgres by (uid=0)
May 7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: pam_unix(cron:session): session closed for user postgres
May 7 00:04:30 hitw-esg-portal-stage-db-1 sshd[10365]: Connection closed by ::1 port 49942 [preauth]
May 7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: pam_unix(cron:session): session opened for user root by (uid=0)
May 7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: pam_unix(cron:session): session closed for user root
May 7 00:06:31 hitw-esg-portal-stage-db-1 sshd[10422]: Connection closed by ::1 port 49948 [preauth]
May 7 00:08:33 hitw-esg-portal-stage-db-1 sshd[10483]: Connection closed by ::1 port 49954 [preauth]
May 7 00:10:34 hitw-esg-portal-stage-db-1 sshd[10540]: Connection closed by ::1 port 49960 [preauth]
May 7 00:12:35 hitw-esg-portal-stage-db-1 sshd[10594]: Connection closed by ::1 port 49966 [preauth]
May 7 00:14:36 hitw-esg-portal-stage-db-1 sshd[10642]: Connection closed by ::1 port 49972 [preauth]
May 7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: pam_unix(cron:session): session opened for user root by (uid=0)
May 7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: pam_unix(cron:session): session closed for user root
May 7 00:16:37 hitw-esg-portal-stage-db-1 sshd[10691]: Connection closed by ::1 port 49978 [preauth]
May 7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: pam_unix(cron:session): session opened for user root by (uid=0)
May 7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: pam_unix(cron:session): session closed for user root
May 7 00:18:39 hitw-esg-portal-stage-db-1 sshd[10741]: Connection closed by ::1 port 49984 [preauth]
May 7 00:20:40 hitw-esg-portal-stage-db-1 sshd[10787]: Connection closed by ::1 port 49990 [preauth]
May 7 00:22:41 hitw-esg-portal-stage-db-1 sshd[10833]: Connection closed by ::1 port 49996 [preauth]
May 7 00:24:42 hitw-esg-portal-stage-db-1 sshd[10890]: Connection closed by ::1 port 50002 [preauth]
May 7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: pam_unix(cron:session): session opened for user root by (uid=0)
May 7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: pam_unix(cron:session): session closed for user root
May 7 00:25:44 hitw-esg-portal-stage-db-1 sudo: root : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P format=unaligned -c SELECT datname, pg_encoding_to_char(encoding) FROM pg_database WHERE datistemplate = false AND datname != 'postgres'
May 7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session opened for user postgres by (uid=0)
May 7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session closed for user postgres
May 7 00:25:51 hitw-esg-portal-stage-db-1 sudo: root : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P format=unaligned -c SHOW server_version
May 7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session opened for user postgres by (uid=0)
May 7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session closed for user postgres
May 7 00:26:44 hitw-esg-portal-stage-db-1 sshd[11327]: Connection closed by ::1 port 50018 [preauth]

For me, it seems like there is some cron job ran shortly after midnight (either from root or through puppet and restarted the server, because maybe it was close to max connections - just a guess). What do you think?

-Julie N

________________________________
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Wednesday, May 8, 2019 2:17 PM
To: Julie Nishimura
Cc: pgsql-general@lists.postgresql.org; pgsql-general
Subject: Re: postgresql 9.4 restart

On 5/8/19 2:01 PM, Julie Nishimura wrote:

Adrian, thanks for your reply!

PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1),
compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit

This is what Ive found in logs prior to the restart:

UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07
00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are
reserved for non-replication superuser connections",,,,,,,,,""
2019-05-07 00:08:26.778

The above is not good.

What do you have set for max_connections?

How many connections are you trying against the database?

Do you have replication set up to other Postgres instances(uk, au, us)?

What does the system log show from the same time period?

UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07
00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021
user=hitwise database=hitwise_us_stg host=10.200.193.58
port=49424",,,,,,,,,""
2019-05-07 00:08:26.779
UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07
00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018
user=hitwise database=hitwise_us_stg host=10.200.193.58
port=49426",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17
UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17
UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07
00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07
00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07
00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07
00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07
00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07
00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07
00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339
user=hitwise database=hitwise_uk_stg host=10.200.193.58
port=49246",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07
00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215
user=hitwise database=hitwise_au_stg host=10.200.250.49
port=31100",,,,,,,,,""

Does it help?

------------------------------------------------------------------------
*From:* Adrian Klaver <adrian.klaver@aklaver.com>
*Sent:* Wednesday, May 8, 2019 12:07 PM
*To:* Julie Nishimura
*Cc:* pgsql-general@lists.postgresql.org; pgsql-general
*Subject:* Re: postgresql 9.4 restart
On 5/8/19 11:00 AM, Julie Nishimura wrote:

Hello,
Our staging 9.4 postgresql has been running couple of weeks with no
problem, but yesterday we saw this error from one of our services that
connects to the staging DBs:

FATAL: terminating connection due to
administrator command; nested exception is
org.postgresql.util.PSQLException: FATAL: terminating connection due to
administrator command”,

Distro and version?

Well it originated in Java code if that helps?

Anything in the Postgres logs just prior to 'FATAL: terminating
connection ...' that might be relevant?

Do the system logs showing anything relevant e.g. the OOM killer
stopping Postgres?

Then I could see that postgresql got restarted, based on the output of
"select pg_postmaster_start_time()"

We are running postgresql on
Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64
x86_64 x86_64 GNU/Linux

How I can find out what cause the restart? Any ideas are welcome. Thanks!

-Julie N

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com

#14Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Julie Nishimura (#13)
Re: postgresql 9.4 restart

On 5/8/19 4:35 PM, Julie Nishimura wrote:

Thanks again for your help.

1) max_connections is set to 500 in our config file
2) our average conn load is 300-400, however, I am not sure if it was
close to max_conn at the moment of restart
3) we do not have replication on this staging server
4) These are the snippets from auth logs with the corresponding time:

I do not see anything that matches.

Do you know what time zone the log entries are in?
cat /etc/timezone might help.

/var/message/syslog:

May �7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10290]: (postgres) CMD
(/usr/bin/find /data/postgresql/9.4/log -type f ! -name '*.bz2' -mmin
+480 -exec bzip2 {} \;)
May �7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10323]: (postgres) CMD
(/usr/bin/find /data/postgresql/9.4/log -type f -empty -mmin +60 -delete)
May �7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10379]: (root) CMD
(command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May �7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10655]: (root) CMD
(command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May �7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10705]: (root) CMD (
cd / && run-parts --report /etc/cron.hourly)
May �7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10902]: (root) CMD
(command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May �7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11401]: (root) LIST
(root)
May �7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11403]: (root) LIST
(postgres)
May �7 00:28:24 hitw-esg-portal-stage-db-1 puppet-agent[10914]:
(/Stage[main]/Ci::Db/Package[postgresql-plpython-]/ensure) ensure
changed 'purged' to 'present'
May �7 00:28:40 hitw-esg-portal-stage-db-1 puppet-agent[10914]: Finished
catalog run in 106.68 seconds
May �7 00:35:01 hitw-esg-portal-stage-db-1 CRON[14398]: (root) CMD
(command -v debian-sa1 > /dev/null && debian-sa1 1 1)

this is from auth.log:

May �7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]:
pam_unix(cron:session): session opened for user postgres by (uid=0)
May �7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]:
pam_unix(cron:session): session closed for user postgres
May �7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]:
pam_unix(cron:session): session opened for user postgres by (uid=0)
May �7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]:
pam_unix(cron:session): session closed for user postgres
May �7 00:02:29 hitw-esg-portal-stage-db-1 sshd[10306]: Connection
closed by ::1 port 49936 [preauth]
May �7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]:
pam_unix(cron:session): session opened for user postgres by (uid=0)
May �7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]:
pam_unix(cron:session): session closed for user postgres
May �7 00:04:30 hitw-esg-portal-stage-db-1 sshd[10365]: Connection
closed by ::1 port 49942 [preauth]
May �7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]:
pam_unix(cron:session): session opened for user root by (uid=0)
May �7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]:
pam_unix(cron:session): session closed for user root
May �7 00:06:31 hitw-esg-portal-stage-db-1 sshd[10422]: Connection
closed by ::1 port 49948 [preauth]
May �7 00:08:33 hitw-esg-portal-stage-db-1 sshd[10483]: Connection
closed by ::1 port 49954 [preauth]
May �7 00:10:34 hitw-esg-portal-stage-db-1 sshd[10540]: Connection
closed by ::1 port 49960 [preauth]
May �7 00:12:35 hitw-esg-portal-stage-db-1 sshd[10594]: Connection
closed by ::1 port 49966 [preauth]
May �7 00:14:36 hitw-esg-portal-stage-db-1 sshd[10642]: Connection
closed by ::1 port 49972 [preauth]
May �7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]:
pam_unix(cron:session): session opened for user root by (uid=0)
May �7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]:
pam_unix(cron:session): session closed for user root
May �7 00:16:37 hitw-esg-portal-stage-db-1 sshd[10691]: Connection
closed by ::1 port 49978 [preauth]
May �7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]:
pam_unix(cron:session): session opened for user root by (uid=0)
May �7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]:
pam_unix(cron:session): session closed for user root
May �7 00:18:39 hitw-esg-portal-stage-db-1 sshd[10741]: Connection
closed by ::1 port 49984 [preauth]
May �7 00:20:40 hitw-esg-portal-stage-db-1 sshd[10787]: Connection
closed by ::1 port 49990 [preauth]
May �7 00:22:41 hitw-esg-portal-stage-db-1 sshd[10833]: Connection
closed by ::1 port 49996 [preauth]
May �7 00:24:42 hitw-esg-portal-stage-db-1 sshd[10890]: Connection
closed by ::1 port 50002 [preauth]
May �7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]:
pam_unix(cron:session): session opened for user root by (uid=0)
May �7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]:
pam_unix(cron:session): session closed for user root
May �7 00:25:44 hitw-esg-portal-stage-db-1 sudo: � � root : TTY=unknown
; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P
format=unaligned -c SELECT datname, pg_encoding_to_char(encoding) FROM
pg_database WHERE datistemplate = false AND datname != 'postgres'
May �7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
session opened for user postgres by (uid=0)
May �7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
session closed for user postgres
May �7 00:25:51 hitw-esg-portal-stage-db-1 sudo: � � root : TTY=unknown
; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P
format=unaligned -c SHOW server_version
May �7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
session opened for user postgres by (uid=0)
May �7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
session closed for user postgres
May �7 00:26:44 hitw-esg-portal-stage-db-1 sshd[11327]: Connection
closed by ::1 port 50018 [preauth]

For me, it seems like there is some cron job ran shortly after midnight
(either from root or through puppet and restarted the server, because
maybe it was close to max connections - just a guess). What do you think?

-Julie N

------------------------------------------------------------------------
*From:* Adrian Klaver <adrian.klaver@aklaver.com>
*Sent:* Wednesday, May 8, 2019 2:17 PM
*To:* Julie Nishimura
*Cc:* pgsql-general@lists.postgresql.org; pgsql-general
*Subject:* Re: postgresql 9.4 restart
On 5/8/19 2:01 PM, Julie Nishimura wrote:

Adrian, thanks for your reply!

PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1),
compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit

This is what Ive found in logs prior to the restart:

UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07
00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are
reserved for non-replication superuser connections",,,,,,,,,""
2019-05-07 00:08:26.778

The above is not good.

What do you have set for max_connections?

How many connections are you trying against the database?

Do you have replication set up to other Postgres instances(uk, au, us)?

What does the system log show from the same time period?

UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07
00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021
user=hitwise database=hitwise_us_stg host=10.200.193.58
port=49424",,,,,,,,,""
2019-05-07 00:08:26.779
UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07
00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018
user=hitwise database=hitwise_us_stg host=10.200.193.58
port=49426",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17
UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17
UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07
00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07
00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07
00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07
00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07
00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07
00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07
00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339
user=hitwise database=hitwise_uk_stg host=10.200.193.58
port=49246",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07
00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215
user=hitwise database=hitwise_au_stg host=10.200.250.49
port=31100",,,,,,,,,""

Does it help?

------------------------------------------------------------------------
*From:* Adrian Klaver <adrian.klaver@aklaver.com>
*Sent:* Wednesday, May 8, 2019 12:07 PM
*To:* Julie Nishimura
*Cc:* pgsql-general@lists.postgresql.org; pgsql-general
*Subject:* Re: postgresql 9.4 restart
On 5/8/19 11:00 AM, Julie Nishimura wrote:

Hello,
Our staging 9.4 postgresql has been running couple of weeks with no
problem, but yesterday we saw this error from one of our services that
connects to the staging DBs:

FATAL: terminating connection due to
administrator command; nested exception is
org.postgresql.util.PSQLException: FATAL: terminating connection due to
administrator command�,

Distro and version?

Well it originated in Java code if that helps?

Anything in the Postgres logs just prior to 'FATAL: terminating
connection ...' that might be relevant?

Do the system logs showing anything relevant e.g. the OOM killer
stopping Postgres?

Then I could see that postgresql got restarted, based on the output of
"select pg_postmaster_start_time()"

We are running postgresql on
Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64
x86_64 x86_64 GNU/Linux

How I can find out what cause the restart? Any ideas are welcome. Thanks!

-Julie N

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com

#15Julie Nishimura
juliezain@hotmail.com
In reply to: Adrian Klaver (#14)
Re: postgresql 9.4 restart

Etc/UTC

________________________________
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, May 9, 2019 7:51 AM
To: Julie Nishimura
Cc: pgsql-general@lists.postgresql.org; pgsql-general
Subject: Re: postgresql 9.4 restart

On 5/8/19 4:35 PM, Julie Nishimura wrote:

Thanks again for your help.

1) max_connections is set to 500 in our config file
2) our average conn load is 300-400, however, I am not sure if it was
close to max_conn at the moment of restart
3) we do not have replication on this staging server
4) These are the snippets from auth logs with the corresponding time:

I do not see anything that matches.

Do you know what time zone the log entries are in?
cat /etc/timezone might help.

/var/message/syslog:

May 7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10290]: (postgres) CMD
(/usr/bin/find /data/postgresql/9.4/log -type f ! -name '*.bz2' -mmin
+480 -exec bzip2 {} \;)
May 7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10323]: (postgres) CMD
(/usr/bin/find /data/postgresql/9.4/log -type f -empty -mmin +60 -delete)
May 7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10379]: (root) CMD
(command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May 7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10655]: (root) CMD
(command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May 7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10705]: (root) CMD (
cd / && run-parts --report /etc/cron.hourly)
May 7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10902]: (root) CMD
(command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May 7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11401]: (root) LIST
(root)
May 7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11403]: (root) LIST
(postgres)
May 7 00:28:24 hitw-esg-portal-stage-db-1 puppet-agent[10914]:
(/Stage[main]/Ci::Db/Package[postgresql-plpython-]/ensure) ensure
changed 'purged' to 'present'
May 7 00:28:40 hitw-esg-portal-stage-db-1 puppet-agent[10914]: Finished
catalog run in 106.68 seconds
May 7 00:35:01 hitw-esg-portal-stage-db-1 CRON[14398]: (root) CMD
(command -v debian-sa1 > /dev/null && debian-sa1 1 1)

this is from auth.log:

May 7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]:
pam_unix(cron:session): session opened for user postgres by (uid=0)
May 7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]:
pam_unix(cron:session): session closed for user postgres
May 7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]:
pam_unix(cron:session): session opened for user postgres by (uid=0)
May 7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]:
pam_unix(cron:session): session closed for user postgres
May 7 00:02:29 hitw-esg-portal-stage-db-1 sshd[10306]: Connection
closed by ::1 port 49936 [preauth]
May 7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]:
pam_unix(cron:session): session opened for user postgres by (uid=0)
May 7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]:
pam_unix(cron:session): session closed for user postgres
May 7 00:04:30 hitw-esg-portal-stage-db-1 sshd[10365]: Connection
closed by ::1 port 49942 [preauth]
May 7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]:
pam_unix(cron:session): session opened for user root by (uid=0)
May 7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]:
pam_unix(cron:session): session closed for user root
May 7 00:06:31 hitw-esg-portal-stage-db-1 sshd[10422]: Connection
closed by ::1 port 49948 [preauth]
May 7 00:08:33 hitw-esg-portal-stage-db-1 sshd[10483]: Connection
closed by ::1 port 49954 [preauth]
May 7 00:10:34 hitw-esg-portal-stage-db-1 sshd[10540]: Connection
closed by ::1 port 49960 [preauth]
May 7 00:12:35 hitw-esg-portal-stage-db-1 sshd[10594]: Connection
closed by ::1 port 49966 [preauth]
May 7 00:14:36 hitw-esg-portal-stage-db-1 sshd[10642]: Connection
closed by ::1 port 49972 [preauth]
May 7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]:
pam_unix(cron:session): session opened for user root by (uid=0)
May 7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]:
pam_unix(cron:session): session closed for user root
May 7 00:16:37 hitw-esg-portal-stage-db-1 sshd[10691]: Connection
closed by ::1 port 49978 [preauth]
May 7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]:
pam_unix(cron:session): session opened for user root by (uid=0)
May 7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]:
pam_unix(cron:session): session closed for user root
May 7 00:18:39 hitw-esg-portal-stage-db-1 sshd[10741]: Connection
closed by ::1 port 49984 [preauth]
May 7 00:20:40 hitw-esg-portal-stage-db-1 sshd[10787]: Connection
closed by ::1 port 49990 [preauth]
May 7 00:22:41 hitw-esg-portal-stage-db-1 sshd[10833]: Connection
closed by ::1 port 49996 [preauth]
May 7 00:24:42 hitw-esg-portal-stage-db-1 sshd[10890]: Connection
closed by ::1 port 50002 [preauth]
May 7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]:
pam_unix(cron:session): session opened for user root by (uid=0)
May 7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]:
pam_unix(cron:session): session closed for user root
May 7 00:25:44 hitw-esg-portal-stage-db-1 sudo: root : TTY=unknown
; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P
format=unaligned -c SELECT datname, pg_encoding_to_char(encoding) FROM
pg_database WHERE datistemplate = false AND datname != 'postgres'
May 7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
session opened for user postgres by (uid=0)
May 7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
session closed for user postgres
May 7 00:25:51 hitw-esg-portal-stage-db-1 sudo: root : TTY=unknown
; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P
format=unaligned -c SHOW server_version
May 7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
session opened for user postgres by (uid=0)
May 7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
session closed for user postgres
May 7 00:26:44 hitw-esg-portal-stage-db-1 sshd[11327]: Connection
closed by ::1 port 50018 [preauth]

For me, it seems like there is some cron job ran shortly after midnight
(either from root or through puppet and restarted the server, because
maybe it was close to max connections - just a guess). What do you think?

-Julie N

------------------------------------------------------------------------
*From:* Adrian Klaver <adrian.klaver@aklaver.com>
*Sent:* Wednesday, May 8, 2019 2:17 PM
*To:* Julie Nishimura
*Cc:* pgsql-general@lists.postgresql.org; pgsql-general
*Subject:* Re: postgresql 9.4 restart
On 5/8/19 2:01 PM, Julie Nishimura wrote:

Adrian, thanks for your reply!

PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1),
compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit

This is what Ive found in logs prior to the restart:

UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07
00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are
reserved for non-replication superuser connections",,,,,,,,,""
2019-05-07 00:08:26.778

The above is not good.

What do you have set for max_connections?

How many connections are you trying against the database?

Do you have replication set up to other Postgres instances(uk, au, us)?

What does the system log show from the same time period?

UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07
00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021
user=hitwise database=hitwise_us_stg host=10.200.193.58
port=49424",,,,,,,,,""
2019-05-07 00:08:26.779
UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07
00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018
user=hitwise database=hitwise_us_stg host=10.200.193.58
port=49426",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17
UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17
UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07
00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07
00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07
00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07
00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07
00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07
00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to
administrator command",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07
00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339
user=hitwise database=hitwise_uk_stg host=10.200.193.58
port=49246",,,,,,,,,""
2019-05-07 00:08:28.060
UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07
00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215
user=hitwise database=hitwise_au_stg host=10.200.250.49
port=31100",,,,,,,,,""

Does it help?

------------------------------------------------------------------------
*From:* Adrian Klaver <adrian.klaver@aklaver.com>
*Sent:* Wednesday, May 8, 2019 12:07 PM
*To:* Julie Nishimura
*Cc:* pgsql-general@lists.postgresql.org; pgsql-general
*Subject:* Re: postgresql 9.4 restart
On 5/8/19 11:00 AM, Julie Nishimura wrote:

Hello,
Our staging 9.4 postgresql has been running couple of weeks with no
problem, but yesterday we saw this error from one of our services that
connects to the staging DBs:

FATAL: terminating connection due to
administrator command; nested exception is
org.postgresql.util.PSQLException: FATAL: terminating connection due to
administrator command”,

Distro and version?

Well it originated in Java code if that helps?

Anything in the Postgres logs just prior to 'FATAL: terminating
connection ...' that might be relevant?

Do the system logs showing anything relevant e.g. the OOM killer
stopping Postgres?

Then I could see that postgresql got restarted, based on the output of
"select pg_postmaster_start_time()"

We are running postgresql on
Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64
x86_64 x86_64 GNU/Linux

How I can find out what cause the restart? Any ideas are welcome. Thanks!

-Julie N

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com

#16Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Julie Nishimura (#15)
Re: postgresql 9.4 restart

On 5/9/19 8:10 AM, Julie Nishimura wrote:

Etc/UTC

Well there goes that theory:(
Not sure where to go from here, other then monitor the logs and
connections more aggressively.

--
Adrian Klaver
adrian.klaver@aklaver.com

#17Julie Nishimura
juliezain@hotmail.com
In reply to: Adrian Klaver (#16)
Re: postgresql 9.4 restart

Thanks for your help, Adrian!

________________________________
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, May 9, 2019 9:41 AM
To: Julie Nishimura
Cc: pgsql-general@lists.postgresql.org; pgsql-general
Subject: Re: postgresql 9.4 restart

On 5/9/19 8:10 AM, Julie Nishimura wrote:

Etc/UTC

Well there goes that theory:(
Not sure where to go from here, other then monitor the logs and
connections more aggressively.

--
Adrian Klaver
adrian.klaver@aklaver.com

#18Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Ian Lawrence Barwick (#8)
Re: PG version recommendation

On 2019-05-08 13:41:08 +0900, Ian Barwick wrote:

On Wed, 8 May 2019 at 07:19, Tim Cross <theophilusx@gmail.com> wrote:

I would find out if the IT team who will maintain the system are running
a specific Linux distribution, such as RHEL and just go with the PG
version that is on that distribution.

Bear in mind, depending on the OS, the default version available may be
chronically outdated and no longer supported by the community. For example
the CentOS 7 out-of-the-box PostgreSQL version is 9.2.24.

RHEL 8 was just released and contains PostgreSQL 10. So depending on how
long the IT team takes to integrate a new base OS version into their
standard procedures, that might be the best option.

RHEL also has "software collections" (which are new to me although the
guide is from 2013 - so you can guess how long it's been that I last
administrated RHEL systems) which include PostgreSQL 10 for RHEL 7
according to https://www.postgresql.org/download/linux/redhat/.

Of course, if the IT team is willing to use the PostgreSQL yum
repository, then that is even better.

All this of course assumes that the IT team insists on Redhat, which the
OP hasn't said. If there is a wider varietyl of distributions to choose
from, my preference would be Debian or Ubuntu (in that order).

hp

--
_ | Peter J. Holzer | we build much bigger, better disasters now
|_|_) | | because we have much more sophisticated
| | | hjp@hjp.at | management tools.
__/ | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/&gt;