Thread: PG version recommendation

PG version recommendation

From
David Gauthier
Date:
Hi:

I'm going to be requesting a PG instance supported by an IT team in a large corp.  They will be creating the server as a VM.  We will be loading the DB using scripts (perl/dbi) on linux, possibly using bulk loading techniques if that's required.  Queries will come from both linux and the web, but typically the number of concurrent users will be on the order of 10 reads, maybe a couple writers.  < 1T total disk, no partitioning.  I will be requesting PITR.  

I need to pick a PG version in my request.  I want something that will be stable and reliable while, of course, being able to perform well.  What would be a good choice for PG version ?

Also, since the server will be a VM, are there any special recommendations/suggestions might I forward in the request (install options, tuning options, other) ?

Thanks !
 

Re: PG version recommendation

From
Adrian Klaver
Date:
On 5/7/19 11:52 AM, David Gauthier wrote:
> Hi:
> 
> I'm going to be requesting a PG instance supported by an IT team in a 
> large corp.  They will be creating the server as a VM.  We will be 
> loading the DB using scripts (perl/dbi) on linux, possibly using bulk 
> loading techniques if that's required.  Queries will come from both 
> linux and the web, but typically the number of concurrent users will be 
> on the order of 10 reads, maybe a couple writers.  < 1T total disk, no 
> partitioning.  I will be requesting PITR.
> 
> I need to pick a PG version in my request.  I want something that will 
> be stable and reliable while, of course, being able to perform well.  
> What would be a good choice for PG version ?

The latest production release v 11. You will have community support 
until November 2023:

https://www.postgresql.org/support/versioning/

As will all suggestions test under your conditions to verify.


> 
> Also, since the server will be a VM, are there any special 
> recommendations/suggestions might I forward in the request (install 
> options, tuning options, other) ?
> 
> Thanks !


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: PG version recommendation

From
Thomas Kellerer
Date:
David Gauthier schrieb am 07.05.2019 um 20:52:
> I need to pick a PG version in my request.  I want something that
> will be stable and reliable while, of course, being able to perform
> well.  What would be a good choice for PG version?

There is no reason to not choose the latest version.
So I would suggest to pick Postgres 11





Re: PG version recommendation

From
Ron
Date:
On 5/7/19 1:52 PM, David Gauthier wrote:
Hi:

I'm going to be requesting a PG instance supported by an IT team in a large corp.  They will be creating the server as a VM.  We will be loading the DB using scripts (perl/dbi) on linux, possibly using bulk loading techniques if that's required.  Queries will come from both linux and the web, but typically the number of concurrent users will be on the order of 10 reads, maybe a couple writers.  < 1T total disk, no partitioning.  I will be requesting PITR.  

I need to pick a PG version in my request.  I want something that will be stable and reliable while, of course, being able to perform well.  What would be a good choice for PG version ?

Also, since the server will be a VM, are there any special recommendations/suggestions might I forward in the request (install options, tuning options, other) ?

Home-rolled application, or third party?  If third party, they'll have something to say about supported version numbers.


--
Angular momentum makes the world go 'round.

Re: PG version recommendation

From
David Gauthier
Date:
>>Home-rolled application, or third party? 
Are you asking about how they do VMs ?
They already provide PG v9.6.7 , so I gather they're not averse to supporting PG DBs. 


On Tue, May 7, 2019 at 4:05 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 5/7/19 1:52 PM, David Gauthier wrote:
Hi:

I'm going to be requesting a PG instance supported by an IT team in a large corp.  They will be creating the server as a VM.  We will be loading the DB using scripts (perl/dbi) on linux, possibly using bulk loading techniques if that's required.  Queries will come from both linux and the web, but typically the number of concurrent users will be on the order of 10 reads, maybe a couple writers.  < 1T total disk, no partitioning.  I will be requesting PITR.  

I need to pick a PG version in my request.  I want something that will be stable and reliable while, of course, being able to perform well.  What would be a good choice for PG version ?

Also, since the server will be a VM, are there any special recommendations/suggestions might I forward in the request (install options, tuning options, other) ?

Home-rolled application, or third party?  If third party, they'll have something to say about supported version numbers.


--
Angular momentum makes the world go 'round.

Re: PG version recommendation

From
Tim Cross
Date:
I would find out if the IT team who will maintain the system are running
a specific Linux distribution, such as RHEL and just go with the PG
version that is on that distribution.

Large corp rarely have sufficient IT resources. Unless you specifically
need a particular PG version (which does not seem to be the case based
on your info), you are better off sticking with the version provided by
whatever distro they use. This will ensure reasonable updates and
patches. In corp environments, where IT resources are thin on the
ground, any custom install often results in poor patching and update
cycles because it falls outside 'standard procedures'.

With respect to hardware specifications, it really depends a lot on what
the infrastructure is. Typically, you would be better off specifying the
performance and storage (size) you require and leave it to the IT team
to work out how to best satisfy those requirements e.g. support x
concurrent connections, y Tb/Gb of storage, backup requirements i.e. how
frequent, how many versions and retention requirements. Include details
of any additional PG packages you may need/want and how/where the
database will need to be accessed from.

As you indicate the host will be a VM, it should be relatively easy to
scale up/down cpus or memory as required, unless you have special
requirements (very complex queries, very large data sets, complex data
models involving GIS, XML, etc that may exceed resources available in
their VM infrastructure).

From your description, your database sounds fairly standard with no
unusual requirements. The number of concurrent users is low and it
sounds like it may be a new application where you probably don't yet
know where performance or resource bottlenecks will be. A standard Linux
server with 16+Gb memory and a couple of Gb for storage running PG 9.6
or higher is likely to be a reasonable starting point.

It would also be a good idea to speak to the IT team and see if they
have any procedures/policies for requesting resources. Ask them what
info they need to know and then gather that. It is unlikely to help if
yuou specify hardware requirements they cannot easily support,
especially if those requirements are really just arbitrary and based on
external recommendations from people who don't know what the
infrastructure is. Nothing frustrates IT teams more than being require
to provide over specified systems which consume valuable resources that
are never used or demand custom configurations which are unnecessary and
just add to their maintenance overheads. 

Tim

David Gauthier <davegauthierpg@gmail.com> writes:

> Hi:
>
> I'm going to be requesting a PG instance supported by an IT team in a large
> corp.  They will be creating the server as a VM.  We will be loading the DB
> using scripts (perl/dbi) on linux, possibly using bulk loading techniques
> if that's required.  Queries will come from both linux and the web, but
> typically the number of concurrent users will be on the order of 10 reads,
> maybe a couple writers.  < 1T total disk, no partitioning.  I will be
> requesting PITR.
>
> I need to pick a PG version in my request.  I want something that will be
> stable and reliable while, of course, being able to perform well.  What
> would be a good choice for PG version ?
>
> Also, since the server will be a VM, are there any special
> recommendations/suggestions might I forward in the request (install
> options, tuning options, other) ?
>
> Thanks !


--
Tim Cross



Re: PG version recommendation

From
Ron
Date:
No, I'm asking about the application.

On 5/7/19 3:51 PM, David Gauthier wrote:
>>Home-rolled application, or third party? 
Are you asking about how they do VMs ?
They already provide PG v9.6.7 , so I gather they're not averse to supporting PG DBs. 


On Tue, May 7, 2019 at 4:05 PM Ron <ronljohnsonjr@gmail.com> wrote:
On 5/7/19 1:52 PM, David Gauthier wrote:
Hi:

I'm going to be requesting a PG instance supported by an IT team in a large corp.  They will be creating the server as a VM.  We will be loading the DB using scripts (perl/dbi) on linux, possibly using bulk loading techniques if that's required.  Queries will come from both linux and the web, but typically the number of concurrent users will be on the order of 10 reads, maybe a couple writers.  < 1T total disk, no partitioning.  I will be requesting PITR.  

I need to pick a PG version in my request.  I want something that will be stable and reliable while, of course, being able to perform well.  What would be a good choice for PG version ?

Also, since the server will be a VM, are there any special recommendations/suggestions might I forward in the request (install options, tuning options, other) ?

Home-rolled application, or third party?  If third party, they'll have something to say about supported version numbers.


--
Angular momentum makes the world go 'round.

--
Angular momentum makes the world go 'round.

Re: PG version recommendation

From
Ian Barwick
Date:
On Wed, 8 May 2019 at 07:19, Tim Cross <theophilusx@gmail.com> wrote:

I would find out if the IT team who will maintain the system are running
a specific Linux distribution, such as RHEL and just go with the PG
version that is on that distribution.

Bear in mind, depending on the OS, the default version available may be
chronically outdated and no longer supported by the community. For example
the CentOS 7 out-of-the-box PostgreSQL version is 9.2.24.


Regards

Ian Barwick

--
  Ian Barwick                   https://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

postgresql 9.4 restart

From
Julie Nishimura
Date:
Hello,
Our staging 9.4 postgresql has been running couple of weeks with no problem, but yesterday we saw this error from one of our services that connects to the staging DBs:

FATAL: terminating connection due to
administrator command; nested exception is
org.postgresql.util.PSQLException: FATAL: terminating connection due to
administrator command”,

Then I could see that postgresql got restarted, based on the output of "select pg_postmaster_start_time()"

We are running postgresql on
Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

How I can find out what cause the restart? Any ideas are welcome. Thanks!

-Julie N


Re: postgresql 9.4 restart

From
Adrian Klaver
Date:
On 5/8/19 11:00 AM, Julie Nishimura wrote:
> Hello,
> Our staging 9.4 postgresql has been running couple of weeks with no 
> problem, but yesterday we saw this error from one of our services that 
> connects to the staging DBs:
> 
> FATAL: terminating connection due to
> administrator command; nested exception is
> org.postgresql.util.PSQLException: FATAL: terminating connection due to
> administrator command”,

Distro and version?

Well it originated in Java code if that helps?

Anything in the Postgres logs just prior to 'FATAL: terminating 
connection ...' that might be relevant?

Do the system logs showing anything relevant e.g. the OOM killer 
stopping Postgres?



> 
> Then I could see that postgresql got restarted, based on the output of 
> "select pg_postmaster_start_time()"
> 
> We are running postgresql on
> Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 
> x86_64 x86_64 GNU/Linux
> 
> How I can find out what cause the restart? Any ideas are welcome. Thanks!
> 
> -Julie N
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: postgresql 9.4 restart

From
Julie Nishimura
Date:
Adrian, thanks for your reply!



PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1), compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit

This is what Ive found in logs prior to the restart:
2019-05-07 00:08:26.771 UTC,,,8677,"10.200.193.58:49428",5cd0cc7a.21e5,1,"",2019-05-07 00:08:26 UTC,,0,LOG,00000,"connection received: host=10.200.193.58 port=49428",,,,,,,,,""
2019-05-07 00:08:26.772 UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,2,"authentication",2019-05-07 00:08:26 UTC,448/970,0,LOG,00000,"connection authorized: user=hitwise database=hitwise_us_stg",,,,,,,,,""
2019-05-07 00:08:26.772 UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07 00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are reserved for non-replication superuser connections",,,,,,,,,""
2019-05-07 00:08:26.778 UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021 user=hitwise database=hitwise_us_stg host=10.200.193.58 port=49424",,,,,,,,,""
2019-05-07 00:08:26.779 UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018 user=hitwise database=hitwise_us_stg host=10.200.193.58 port=49426",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17 UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17 UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07 00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07 00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07 00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07 00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07 00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07 00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07 00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339 user=hitwise database=hitwise_uk_stg host=10.200.193.58 port=49246",,,,,,,,,""
2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07 00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215 user=hitwise database=hitwise_au_stg host=10.200.250.49 port=31100",,,,,,,,,""

Does it help?


From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Wednesday, May 8, 2019 12:07 PM
To: Julie Nishimura
Cc: pgsql-general@lists.postgresql.org; pgsql-general
Subject: Re: postgresql 9.4 restart
 
On 5/8/19 11:00 AM, Julie Nishimura wrote:
> Hello,
> Our staging 9.4 postgresql has been running couple of weeks with no
> problem, but yesterday we saw this error from one of our services that
> connects to the staging DBs:
>
> FATAL: terminating connection due to
> administrator command; nested exception is
> org.postgresql.util.PSQLException: FATAL: terminating connection due to
> administrator command”,

Distro and version?

Well it originated in Java code if that helps?

Anything in the Postgres logs just prior to 'FATAL: terminating
connection ...' that might be relevant?

Do the system logs showing anything relevant e.g. the OOM killer
stopping Postgres?



>
> Then I could see that postgresql got restarted, based on the output of
> "select pg_postmaster_start_time()"
>
> We are running postgresql on
> Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64
> x86_64 x86_64 GNU/Linux
>
> How I can find out what cause the restart? Any ideas are welcome. Thanks!
>
> -Julie N
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: postgresql 9.4 restart

From
Adrian Klaver
Date:
On 5/8/19 2:01 PM, Julie Nishimura wrote:
> Adrian, thanks for your reply!
> 
> 
> 
> PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1), 
> compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
> 
> This is what Ive found in logs prior to the restart:

> UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07 
> 00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are 
> reserved for non-replication superuser connections",,,,,,,,,""
> 2019-05-07 00:08:26.778 

The above is not good.

What do you have set for max_connections?

How many connections are you trying against the database?

Do you have replication set up to other Postgres instances(uk, au, us)?

What does the system log show from the same time period?



> UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07 
> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021 
> user=hitwise database=hitwise_us_stg host=10.200.193.58 
> port=49424",,,,,,,,,""
> 2019-05-07 00:08:26.779 
> UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07 
> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018 
> user=hitwise database=hitwise_us_stg host=10.200.193.58 
> port=49426",,,,,,,,,""
> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17 
> UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17 
> UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
> 2019-05-07 00:08:28.060 
> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07 
> 00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to 
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060 
> UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07 
> 00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to 
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060 
> UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07 
> 00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to 
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060 
> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07 
> 00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to 
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060 
> UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07 
> 00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to 
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060 
> UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07 
> 00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to 
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060 
> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07 
> 00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339 
> user=hitwise database=hitwise_uk_stg host=10.200.193.58 
> port=49246",,,,,,,,,""
> 2019-05-07 00:08:28.060 
> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07 
> 00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215 
> user=hitwise database=hitwise_au_stg host=10.200.250.49 
> port=31100",,,,,,,,,""
> 
> Does it help?
> 
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Wednesday, May 8, 2019 12:07 PM
> *To:* Julie Nishimura
> *Cc:* pgsql-general@lists.postgresql.org; pgsql-general
> *Subject:* Re: postgresql 9.4 restart
> On 5/8/19 11:00 AM, Julie Nishimura wrote:
>> Hello,
>> Our staging 9.4 postgresql has been running couple of weeks with no 
>> problem, but yesterday we saw this error from one of our services that 
>> connects to the staging DBs:
>> 
>> FATAL: terminating connection due to
>> administrator command; nested exception is
>> org.postgresql.util.PSQLException: FATAL: terminating connection due to
>> administrator command”,
> 
> Distro and version?
> 
> Well it originated in Java code if that helps?
> 
> Anything in the Postgres logs just prior to 'FATAL: terminating
> connection ...' that might be relevant?
> 
> Do the system logs showing anything relevant e.g. the OOM killer
> stopping Postgres?
> 
> 
> 
>> 
>> Then I could see that postgresql got restarted, based on the output of 
>> "select pg_postmaster_start_time()"
>> 
>> We are running postgresql on
>> Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 
>> x86_64 x86_64 GNU/Linux
>> 
>> How I can find out what cause the restart? Any ideas are welcome. Thanks!
>> 
>> -Julie N
>> 
>> 
> 
> 
> -- 
> Adrian Klaver
> adrian.klaver@aklaver.com


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: postgresql 9.4 restart

From
Julie Nishimura
Date:
Thanks again for your help.

1) max_connections is set to 500 in our config file
2) our average conn load is 300-400, however, I am not sure if it was close to max_conn at the moment of restart
3) we do not have replication on this staging server
4) These are the snippets from auth logs with the corresponding time:

/var/message/syslog:

May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10290]: (postgres) CMD (/usr/bin/find /data/postgresql/9.4/log -type f ! -name '*.bz2' -mmin +480 -exec bzip2 {} \;)
May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10323]: (postgres) CMD (/usr/bin/find /data/postgresql/9.4/log -type f -empty -mmin +60 -delete)
May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10379]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10655]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10705]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10902]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11401]: (root) LIST (root)
May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11403]: (root) LIST (postgres)
May  7 00:28:24 hitw-esg-portal-stage-db-1 puppet-agent[10914]: (/Stage[main]/Ci::Db/Package[postgresql-plpython-]/ensure) ensure changed 'purged' to 'present'
May  7 00:28:40 hitw-esg-portal-stage-db-1 puppet-agent[10914]: Finished catalog run in 106.68 seconds
May  7 00:35:01 hitw-esg-portal-stage-db-1 CRON[14398]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)

this is from auth.log:

May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: pam_unix(cron:session): session opened for user postgres by (uid=0)
May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: pam_unix(cron:session): session closed for user postgres
May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: pam_unix(cron:session): session opened for user postgres by (uid=0)
May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: pam_unix(cron:session): session closed for user postgres
May  7 00:02:29 hitw-esg-portal-stage-db-1 sshd[10306]: Connection closed by ::1 port 49936 [preauth]
May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: pam_unix(cron:session): session opened for user postgres by (uid=0)
May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: pam_unix(cron:session): session closed for user postgres
May  7 00:04:30 hitw-esg-portal-stage-db-1 sshd[10365]: Connection closed by ::1 port 49942 [preauth]
May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: pam_unix(cron:session): session opened for user root by (uid=0)
May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: pam_unix(cron:session): session closed for user root
May  7 00:06:31 hitw-esg-portal-stage-db-1 sshd[10422]: Connection closed by ::1 port 49948 [preauth]
May  7 00:08:33 hitw-esg-portal-stage-db-1 sshd[10483]: Connection closed by ::1 port 49954 [preauth]
May  7 00:10:34 hitw-esg-portal-stage-db-1 sshd[10540]: Connection closed by ::1 port 49960 [preauth]
May  7 00:12:35 hitw-esg-portal-stage-db-1 sshd[10594]: Connection closed by ::1 port 49966 [preauth]
May  7 00:14:36 hitw-esg-portal-stage-db-1 sshd[10642]: Connection closed by ::1 port 49972 [preauth]
May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: pam_unix(cron:session): session opened for user root by (uid=0)
May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: pam_unix(cron:session): session closed for user root
May  7 00:16:37 hitw-esg-portal-stage-db-1 sshd[10691]: Connection closed by ::1 port 49978 [preauth]
May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: pam_unix(cron:session): session opened for user root by (uid=0)
May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: pam_unix(cron:session): session closed for user root
May  7 00:18:39 hitw-esg-portal-stage-db-1 sshd[10741]: Connection closed by ::1 port 49984 [preauth]
May  7 00:20:40 hitw-esg-portal-stage-db-1 sshd[10787]: Connection closed by ::1 port 49990 [preauth]
May  7 00:22:41 hitw-esg-portal-stage-db-1 sshd[10833]: Connection closed by ::1 port 49996 [preauth]
May  7 00:24:42 hitw-esg-portal-stage-db-1 sshd[10890]: Connection closed by ::1 port 50002 [preauth]
May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: pam_unix(cron:session): session opened for user root by (uid=0)
May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: pam_unix(cron:session): session closed for user root
May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P format=unaligned -c SELECT datname, pg_encoding_to_char(encoding) FROM pg_database WHERE datistemplate = false AND datname != 'postgres'
May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session opened for user postgres by (uid=0)
May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session closed for user postgres
May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P format=unaligned -c SHOW server_version
May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session opened for user postgres by (uid=0)
May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session closed for user postgres
May  7 00:26:44 hitw-esg-portal-stage-db-1 sshd[11327]: Connection closed by ::1 port 50018 [preauth]


For me, it seems like there is some cron job ran shortly after midnight (either from root or through puppet and restarted the server, because maybe it was close to max connections - just a guess). What do you think?

-Julie N



From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Wednesday, May 8, 2019 2:17 PM
To: Julie Nishimura
Cc: pgsql-general@lists.postgresql.org; pgsql-general
Subject: Re: postgresql 9.4 restart
 
On 5/8/19 2:01 PM, Julie Nishimura wrote:
> Adrian, thanks for your reply!
>
>
>
> PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1),
> compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
>
> This is what Ive found in logs prior to the restart:

> UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07
> 00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are
> reserved for non-replication superuser connections",,,,,,,,,""
> 2019-05-07 00:08:26.778

The above is not good.

What do you have set for max_connections?

How many connections are you trying against the database?

Do you have replication set up to other Postgres instances(uk, au, us)?

What does the system log show from the same time period?



> UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07
> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021
> user=hitwise database=hitwise_us_stg host=10.200.193.58
> port=49424",,,,,,,,,""
> 2019-05-07 00:08:26.779
> UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07
> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018
> user=hitwise database=hitwise_us_stg host=10.200.193.58
> port=49426",,,,,,,,,""
> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17
> UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17
> UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07
> 00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07
> 00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07
> 00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07
> 00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07
> 00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07
> 00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07
> 00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339
> user=hitwise database=hitwise_uk_stg host=10.200.193.58
> port=49246",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07
> 00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215
> user=hitwise database=hitwise_au_stg host=10.200.250.49
> port=31100",,,,,,,,,""
>
> Does it help?
>
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Wednesday, May 8, 2019 12:07 PM
> *To:* Julie Nishimura
> *Cc:* pgsql-general@lists.postgresql.org; pgsql-general
> *Subject:* Re: postgresql 9.4 restart
> On 5/8/19 11:00 AM, Julie Nishimura wrote:
>> Hello,
>> Our staging 9.4 postgresql has been running couple of weeks with no
>> problem, but yesterday we saw this error from one of our services that
>> connects to the staging DBs:
>>
>> FATAL: terminating connection due to
>> administrator command; nested exception is
>> org.postgresql.util.PSQLException: FATAL: terminating connection due to
>> administrator command”,
>
> Distro and version?
>
> Well it originated in Java code if that helps?
>
> Anything in the Postgres logs just prior to 'FATAL: terminating
> connection ...' that might be relevant?
>
> Do the system logs showing anything relevant e.g. the OOM killer
> stopping Postgres?
>
>
>
>>
>> Then I could see that postgresql got restarted, based on the output of
>> "select pg_postmaster_start_time()"
>>
>> We are running postgresql on
>> Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64
>> x86_64 x86_64 GNU/Linux
>>
>> How I can find out what cause the restart? Any ideas are welcome. Thanks!
>>
>> -Julie N
>>
>>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: postgresql 9.4 restart

From
Adrian Klaver
Date:
On 5/8/19 4:35 PM, Julie Nishimura wrote:
> Thanks again for your help.
> 
> 1) max_connections is set to 500 in our config file
> 2) our average conn load is 300-400, however, I am not sure if it was 
> close to max_conn at the moment of restart
> 3) we do not have replication on this staging server
> 4) These are the snippets from auth logs with the corresponding time:

I do not see anything that matches.

Do you know what time zone the log entries are in?
cat /etc/timezone might help.

> 
> /var/message/syslog:
> 
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10290]: (postgres) CMD 
> (/usr/bin/find /data/postgresql/9.4/log -type f ! -name '*.bz2' -mmin 
> +480 -exec bzip2 {} \;)
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10323]: (postgres) CMD 
> (/usr/bin/find /data/postgresql/9.4/log -type f -empty -mmin +60 -delete)
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10379]: (root) CMD 
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10655]: (root) CMD 
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10705]: (root) CMD (   
> cd / && run-parts --report /etc/cron.hourly)
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10902]: (root) CMD 
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11401]: (root) LIST 
> (root)
> May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11403]: (root) LIST 
> (postgres)
> May  7 00:28:24 hitw-esg-portal-stage-db-1 puppet-agent[10914]: 
> (/Stage[main]/Ci::Db/Package[postgresql-plpython-]/ensure) ensure 
> changed 'purged' to 'present'
> May  7 00:28:40 hitw-esg-portal-stage-db-1 puppet-agent[10914]: Finished 
> catalog run in 106.68 seconds
> May  7 00:35:01 hitw-esg-portal-stage-db-1 CRON[14398]: (root) CMD 
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> 
> this is from auth.log:
> 
> May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: 
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: 
> pam_unix(cron:session): session closed for user postgres
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: 
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: 
> pam_unix(cron:session): session closed for user postgres
> May  7 00:02:29 hitw-esg-portal-stage-db-1 sshd[10306]: Connection 
> closed by ::1 port 49936 [preauth]
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: 
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: 
> pam_unix(cron:session): session closed for user postgres
> May  7 00:04:30 hitw-esg-portal-stage-db-1 sshd[10365]: Connection 
> closed by ::1 port 49942 [preauth]
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: 
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: 
> pam_unix(cron:session): session closed for user root
> May  7 00:06:31 hitw-esg-portal-stage-db-1 sshd[10422]: Connection 
> closed by ::1 port 49948 [preauth]
> May  7 00:08:33 hitw-esg-portal-stage-db-1 sshd[10483]: Connection 
> closed by ::1 port 49954 [preauth]
> May  7 00:10:34 hitw-esg-portal-stage-db-1 sshd[10540]: Connection 
> closed by ::1 port 49960 [preauth]
> May  7 00:12:35 hitw-esg-portal-stage-db-1 sshd[10594]: Connection 
> closed by ::1 port 49966 [preauth]
> May  7 00:14:36 hitw-esg-portal-stage-db-1 sshd[10642]: Connection 
> closed by ::1 port 49972 [preauth]
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: 
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: 
> pam_unix(cron:session): session closed for user root
> May  7 00:16:37 hitw-esg-portal-stage-db-1 sshd[10691]: Connection 
> closed by ::1 port 49978 [preauth]
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: 
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: 
> pam_unix(cron:session): session closed for user root
> May  7 00:18:39 hitw-esg-portal-stage-db-1 sshd[10741]: Connection 
> closed by ::1 port 49984 [preauth]
> May  7 00:20:40 hitw-esg-portal-stage-db-1 sshd[10787]: Connection 
> closed by ::1 port 49990 [preauth]
> May  7 00:22:41 hitw-esg-portal-stage-db-1 sshd[10833]: Connection 
> closed by ::1 port 49996 [preauth]
> May  7 00:24:42 hitw-esg-portal-stage-db-1 sshd[10890]: Connection 
> closed by ::1 port 50002 [preauth]
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: 
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: 
> pam_unix(cron:session): session closed for user root
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown 
> ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P 
> format=unaligned -c SELECT datname, pg_encoding_to_char(encoding) FROM 
> pg_database WHERE datistemplate = false AND datname != 'postgres'
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): 
> session opened for user postgres by (uid=0)
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): 
> session closed for user postgres
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown 
> ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P 
> format=unaligned -c SHOW server_version
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): 
> session opened for user postgres by (uid=0)
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): 
> session closed for user postgres
> May  7 00:26:44 hitw-esg-portal-stage-db-1 sshd[11327]: Connection 
> closed by ::1 port 50018 [preauth]
> 
> 
> For me, it seems like there is some cron job ran shortly after midnight 
> (either from root or through puppet and restarted the server, because 
> maybe it was close to max connections - just a guess). What do you think?
> 
> -Julie N
> 
> 
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Wednesday, May 8, 2019 2:17 PM
> *To:* Julie Nishimura
> *Cc:* pgsql-general@lists.postgresql.org; pgsql-general
> *Subject:* Re: postgresql 9.4 restart
> On 5/8/19 2:01 PM, Julie Nishimura wrote:
>> Adrian, thanks for your reply!
>> 
>> 
>> 
>> PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1), 
>> compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
>> 
>> This is what Ive found in logs prior to the restart:
> 
>> UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07
>> 00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are 
>> reserved for non-replication superuser connections",,,,,,,,,""
>> 2019-05-07 00:08:26.778 
> 
> The above is not good.
> 
> What do you have set for max_connections?
> 
> How many connections are you trying against the database?
> 
> Do you have replication set up to other Postgres instances(uk, au, us)?
> 
> What does the system log show from the same time period?
> 
> 
> 
>> UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07
>> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021 
>> user=hitwise database=hitwise_us_stg host=10.200.193.58 
>> port=49424",,,,,,,,,""
>> 2019-05-07 00:08:26.779 
>> UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07
>> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018 
>> user=hitwise database=hitwise_us_stg host=10.200.193.58 
>> port=49426",,,,,,,,,""
>> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17 
>> UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
>> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17
>> UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07
>> 00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07
>> 00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07
>> 00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07
>> 00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07
>> 00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07
>> 00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07
>> 00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339 
>> user=hitwise database=hitwise_uk_stg host=10.200.193.58 
>> port=49246",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07
>> 00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215 
>> user=hitwise database=hitwise_au_stg host=10.200.250.49 
>> port=31100",,,,,,,,,""
>> 
>> Does it help?
>> 
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
>> *Sent:* Wednesday, May 8, 2019 12:07 PM
>> *To:* Julie Nishimura
>> *Cc:* pgsql-general@lists.postgresql.org; pgsql-general
>> *Subject:* Re: postgresql 9.4 restart
>> On 5/8/19 11:00 AM, Julie Nishimura wrote:
>>> Hello,
>>> Our staging 9.4 postgresql has been running couple of weeks with no 
>>> problem, but yesterday we saw this error from one of our services that 
>>> connects to the staging DBs:
>>> 
>>> FATAL: terminating connection due to
>>> administrator command; nested exception is
>>> org.postgresql.util.PSQLException: FATAL: terminating connection due to
>>> administrator command”,
>> 
>> Distro and version?
>> 
>> Well it originated in Java code if that helps?
>> 
>> Anything in the Postgres logs just prior to 'FATAL: terminating
>> connection ...' that might be relevant?
>> 
>> Do the system logs showing anything relevant e.g. the OOM killer
>> stopping Postgres?
>> 
>> 
>> 
>>> 
>>> Then I could see that postgresql got restarted, based on the output of 
>>> "select pg_postmaster_start_time()"
>>> 
>>> We are running postgresql on
>>> Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 
>>> x86_64 x86_64 GNU/Linux
>>> 
>>> How I can find out what cause the restart? Any ideas are welcome. Thanks!
>>> 
>>> -Julie N
>>> 
>>> 
>> 
>> 
>> -- 
>> Adrian Klaver
>> adrian.klaver@aklaver.com
> 
> 
> -- 
> Adrian Klaver
> adrian.klaver@aklaver.com


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: postgresql 9.4 restart

From
Julie Nishimura
Date:
Etc/UTC


From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, May 9, 2019 7:51 AM
To: Julie Nishimura
Cc: pgsql-general@lists.postgresql.org; pgsql-general
Subject: Re: postgresql 9.4 restart
 
On 5/8/19 4:35 PM, Julie Nishimura wrote:
> Thanks again for your help.
>
> 1) max_connections is set to 500 in our config file
> 2) our average conn load is 300-400, however, I am not sure if it was
> close to max_conn at the moment of restart
> 3) we do not have replication on this staging server
> 4) These are the snippets from auth logs with the corresponding time:

I do not see anything that matches.

Do you know what time zone the log entries are in?
cat /etc/timezone might help.

>
> /var/message/syslog:
>
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10290]: (postgres) CMD
> (/usr/bin/find /data/postgresql/9.4/log -type f ! -name '*.bz2' -mmin
> +480 -exec bzip2 {} \;)
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10323]: (postgres) CMD
> (/usr/bin/find /data/postgresql/9.4/log -type f -empty -mmin +60 -delete)
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10379]: (root) CMD
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10655]: (root) CMD
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10705]: (root) CMD (  
> cd / && run-parts --report /etc/cron.hourly)
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10902]: (root) CMD
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11401]: (root) LIST
> (root)
> May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11403]: (root) LIST
> (postgres)
> May  7 00:28:24 hitw-esg-portal-stage-db-1 puppet-agent[10914]:
> (/Stage[main]/Ci::Db/Package[postgresql-plpython-]/ensure) ensure
> changed 'purged' to 'present'
> May  7 00:28:40 hitw-esg-portal-stage-db-1 puppet-agent[10914]: Finished
> catalog run in 106.68 seconds
> May  7 00:35:01 hitw-esg-portal-stage-db-1 CRON[14398]: (root) CMD
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
>
> this is from auth.log:
>
> May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]:
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]:
> pam_unix(cron:session): session closed for user postgres
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]:
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]:
> pam_unix(cron:session): session closed for user postgres
> May  7 00:02:29 hitw-esg-portal-stage-db-1 sshd[10306]: Connection
> closed by ::1 port 49936 [preauth]
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]:
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]:
> pam_unix(cron:session): session closed for user postgres
> May  7 00:04:30 hitw-esg-portal-stage-db-1 sshd[10365]: Connection
> closed by ::1 port 49942 [preauth]
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]:
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]:
> pam_unix(cron:session): session closed for user root
> May  7 00:06:31 hitw-esg-portal-stage-db-1 sshd[10422]: Connection
> closed by ::1 port 49948 [preauth]
> May  7 00:08:33 hitw-esg-portal-stage-db-1 sshd[10483]: Connection
> closed by ::1 port 49954 [preauth]
> May  7 00:10:34 hitw-esg-portal-stage-db-1 sshd[10540]: Connection
> closed by ::1 port 49960 [preauth]
> May  7 00:12:35 hitw-esg-portal-stage-db-1 sshd[10594]: Connection
> closed by ::1 port 49966 [preauth]
> May  7 00:14:36 hitw-esg-portal-stage-db-1 sshd[10642]: Connection
> closed by ::1 port 49972 [preauth]
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]:
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]:
> pam_unix(cron:session): session closed for user root
> May  7 00:16:37 hitw-esg-portal-stage-db-1 sshd[10691]: Connection
> closed by ::1 port 49978 [preauth]
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]:
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]:
> pam_unix(cron:session): session closed for user root
> May  7 00:18:39 hitw-esg-portal-stage-db-1 sshd[10741]: Connection
> closed by ::1 port 49984 [preauth]
> May  7 00:20:40 hitw-esg-portal-stage-db-1 sshd[10787]: Connection
> closed by ::1 port 49990 [preauth]
> May  7 00:22:41 hitw-esg-portal-stage-db-1 sshd[10833]: Connection
> closed by ::1 port 49996 [preauth]
> May  7 00:24:42 hitw-esg-portal-stage-db-1 sshd[10890]: Connection
> closed by ::1 port 50002 [preauth]
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]:
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]:
> pam_unix(cron:session): session closed for user root
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown
> ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P
> format=unaligned -c SELECT datname, pg_encoding_to_char(encoding) FROM
> pg_database WHERE datistemplate = false AND datname != 'postgres'
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
> session opened for user postgres by (uid=0)
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
> session closed for user postgres
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown
> ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P
> format=unaligned -c SHOW server_version
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
> session opened for user postgres by (uid=0)
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
> session closed for user postgres
> May  7 00:26:44 hitw-esg-portal-stage-db-1 sshd[11327]: Connection
> closed by ::1 port 50018 [preauth]
>
>
> For me, it seems like there is some cron job ran shortly after midnight
> (either from root or through puppet and restarted the server, because
> maybe it was close to max connections - just a guess). What do you think?
>
> -Julie N
>
>
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Wednesday, May 8, 2019 2:17 PM
> *To:* Julie Nishimura
> *Cc:* pgsql-general@lists.postgresql.org; pgsql-general
> *Subject:* Re: postgresql 9.4 restart
> On 5/8/19 2:01 PM, Julie Nishimura wrote:
>> Adrian, thanks for your reply!
>>
>>
>>
>> PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1),
>> compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
>>
>> This is what Ive found in logs prior to the restart:
>
>> UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07
>> 00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are
>> reserved for non-replication superuser connections",,,,,,,,,""
>> 2019-05-07 00:08:26.778
>
> The above is not good.
>
> What do you have set for max_connections?
>
> How many connections are you trying against the database?
>
> Do you have replication set up to other Postgres instances(uk, au, us)?
>
> What does the system log show from the same time period?
>
>
>
>> UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07
>> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021
>> user=hitwise database=hitwise_us_stg host=10.200.193.58
>> port=49424",,,,,,,,,""
>> 2019-05-07 00:08:26.779
>> UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07
>> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018
>> user=hitwise database=hitwise_us_stg host=10.200.193.58
>> port=49426",,,,,,,,,""
>> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17
>> UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
>> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17
>> UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07
>> 00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07
>> 00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07
>> 00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07
>> 00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07
>> 00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07
>> 00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07
>> 00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339
>> user=hitwise database=hitwise_uk_stg host=10.200.193.58
>> port=49246",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07
>> 00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215
>> user=hitwise database=hitwise_au_stg host=10.200.250.49
>> port=31100",,,,,,,,,""
>>
>> Does it help?
>>
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
>> *Sent:* Wednesday, May 8, 2019 12:07 PM
>> *To:* Julie Nishimura
>> *Cc:* pgsql-general@lists.postgresql.org; pgsql-general
>> *Subject:* Re: postgresql 9.4 restart
>> On 5/8/19 11:00 AM, Julie Nishimura wrote:
>>> Hello,
>>> Our staging 9.4 postgresql has been running couple of weeks with no
>>> problem, but yesterday we saw this error from one of our services that
>>> connects to the staging DBs:
>>>
>>> FATAL: terminating connection due to
>>> administrator command; nested exception is
>>> org.postgresql.util.PSQLException: FATAL: terminating connection due to
>>> administrator command”,
>>
>> Distro and version?
>>
>> Well it originated in Java code if that helps?
>>
>> Anything in the Postgres logs just prior to 'FATAL: terminating
>> connection ...' that might be relevant?
>>
>> Do the system logs showing anything relevant e.g. the OOM killer
>> stopping Postgres?
>>
>>
>>
>>>
>>> Then I could see that postgresql got restarted, based on the output of
>>> "select pg_postmaster_start_time()"
>>>
>>> We are running postgresql on
>>> Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64
>>> x86_64 x86_64 GNU/Linux
>>>
>>> How I can find out what cause the restart? Any ideas are welcome. Thanks!
>>>
>>> -Julie N
>>>
>>>
>>
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: postgresql 9.4 restart

From
Adrian Klaver
Date:
On 5/9/19 8:10 AM, Julie Nishimura wrote:
> Etc/UTC

Well there goes that theory:(
Not sure where to go from here, other then monitor the logs and 
connections more aggressively.


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: postgresql 9.4 restart

From
Julie Nishimura
Date:
Thanks for your help, Adrian!


From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Thursday, May 9, 2019 9:41 AM
To: Julie Nishimura
Cc: pgsql-general@lists.postgresql.org; pgsql-general
Subject: Re: postgresql 9.4 restart
 
On 5/9/19 8:10 AM, Julie Nishimura wrote:
> Etc/UTC

Well there goes that theory:(
Not sure where to go from here, other then monitor the logs and
connections more aggressively.


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: PG version recommendation

From
"Peter J. Holzer"
Date:
On 2019-05-08 13:41:08 +0900, Ian Barwick wrote:
> On Wed, 8 May 2019 at 07:19, Tim Cross <theophilusx@gmail.com> wrote:
>
>     I would find out if the IT team who will maintain the system are running
>     a specific Linux distribution, such as RHEL and just go with the PG
>     version that is on that distribution.
>
>
> Bear in mind, depending on the OS, the default version available may be
> chronically outdated and no longer supported by the community. For example
> the CentOS 7 out-of-the-box PostgreSQL version is 9.2.24.

RHEL 8 was just released and contains PostgreSQL 10. So depending on how
long the IT team takes to integrate a new base OS version into their
standard procedures, that might be the best option.

RHEL also has "software collections" (which are new to me although the
guide is from 2013 - so you can guess how long it's been that I last
administrated RHEL systems) which include PostgreSQL 10 for RHEL 7
according to https://www.postgresql.org/download/linux/redhat/.

Of course, if the IT team is willing to use the PostgreSQL yum
repository, then that is even better.

All this of course assumes that the IT team insists on Redhat, which the
OP hasn't said. If there is a wider varietyl of distributions to choose
from, my preference would be Debian or Ubuntu (in that order).

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

Attachment