Re: postgresql 9.4 restart - Mailing list pgsql-general

From Adrian Klaver
Subject Re: postgresql 9.4 restart
Date
Msg-id 86e74bbc-a495-d15c-fea8-4877b83b1c3d@aklaver.com
Whole thread Raw
In response to Re: postgresql 9.4 restart  (Julie Nishimura <juliezain@hotmail.com>)
Responses Re: postgresql 9.4 restart
List pgsql-general
On 5/8/19 4:35 PM, Julie Nishimura wrote:
> Thanks again for your help.
> 
> 1) max_connections is set to 500 in our config file
> 2) our average conn load is 300-400, however, I am not sure if it was 
> close to max_conn at the moment of restart
> 3) we do not have replication on this staging server
> 4) These are the snippets from auth logs with the corresponding time:

I do not see anything that matches.

Do you know what time zone the log entries are in?
cat /etc/timezone might help.

> 
> /var/message/syslog:
> 
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10290]: (postgres) CMD 
> (/usr/bin/find /data/postgresql/9.4/log -type f ! -name '*.bz2' -mmin 
> +480 -exec bzip2 {} \;)
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10323]: (postgres) CMD 
> (/usr/bin/find /data/postgresql/9.4/log -type f -empty -mmin +60 -delete)
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10379]: (root) CMD 
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10655]: (root) CMD 
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10705]: (root) CMD (   
> cd / && run-parts --report /etc/cron.hourly)
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10902]: (root) CMD 
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11401]: (root) LIST 
> (root)
> May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11403]: (root) LIST 
> (postgres)
> May  7 00:28:24 hitw-esg-portal-stage-db-1 puppet-agent[10914]: 
> (/Stage[main]/Ci::Db/Package[postgresql-plpython-]/ensure) ensure 
> changed 'purged' to 'present'
> May  7 00:28:40 hitw-esg-portal-stage-db-1 puppet-agent[10914]: Finished 
> catalog run in 106.68 seconds
> May  7 00:35:01 hitw-esg-portal-stage-db-1 CRON[14398]: (root) CMD 
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> 
> this is from auth.log:
> 
> May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: 
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: 
> pam_unix(cron:session): session closed for user postgres
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: 
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: 
> pam_unix(cron:session): session closed for user postgres
> May  7 00:02:29 hitw-esg-portal-stage-db-1 sshd[10306]: Connection 
> closed by ::1 port 49936 [preauth]
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: 
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: 
> pam_unix(cron:session): session closed for user postgres
> May  7 00:04:30 hitw-esg-portal-stage-db-1 sshd[10365]: Connection 
> closed by ::1 port 49942 [preauth]
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: 
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: 
> pam_unix(cron:session): session closed for user root
> May  7 00:06:31 hitw-esg-portal-stage-db-1 sshd[10422]: Connection 
> closed by ::1 port 49948 [preauth]
> May  7 00:08:33 hitw-esg-portal-stage-db-1 sshd[10483]: Connection 
> closed by ::1 port 49954 [preauth]
> May  7 00:10:34 hitw-esg-portal-stage-db-1 sshd[10540]: Connection 
> closed by ::1 port 49960 [preauth]
> May  7 00:12:35 hitw-esg-portal-stage-db-1 sshd[10594]: Connection 
> closed by ::1 port 49966 [preauth]
> May  7 00:14:36 hitw-esg-portal-stage-db-1 sshd[10642]: Connection 
> closed by ::1 port 49972 [preauth]
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: 
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: 
> pam_unix(cron:session): session closed for user root
> May  7 00:16:37 hitw-esg-portal-stage-db-1 sshd[10691]: Connection 
> closed by ::1 port 49978 [preauth]
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: 
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: 
> pam_unix(cron:session): session closed for user root
> May  7 00:18:39 hitw-esg-portal-stage-db-1 sshd[10741]: Connection 
> closed by ::1 port 49984 [preauth]
> May  7 00:20:40 hitw-esg-portal-stage-db-1 sshd[10787]: Connection 
> closed by ::1 port 49990 [preauth]
> May  7 00:22:41 hitw-esg-portal-stage-db-1 sshd[10833]: Connection 
> closed by ::1 port 49996 [preauth]
> May  7 00:24:42 hitw-esg-portal-stage-db-1 sshd[10890]: Connection 
> closed by ::1 port 50002 [preauth]
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: 
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: 
> pam_unix(cron:session): session closed for user root
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown 
> ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P 
> format=unaligned -c SELECT datname, pg_encoding_to_char(encoding) FROM 
> pg_database WHERE datistemplate = false AND datname != 'postgres'
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): 
> session opened for user postgres by (uid=0)
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): 
> session closed for user postgres
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown 
> ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P 
> format=unaligned -c SHOW server_version
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): 
> session opened for user postgres by (uid=0)
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): 
> session closed for user postgres
> May  7 00:26:44 hitw-esg-portal-stage-db-1 sshd[11327]: Connection 
> closed by ::1 port 50018 [preauth]
> 
> 
> For me, it seems like there is some cron job ran shortly after midnight 
> (either from root or through puppet and restarted the server, because 
> maybe it was close to max connections - just a guess). What do you think?
> 
> -Julie N
> 
> 
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Wednesday, May 8, 2019 2:17 PM
> *To:* Julie Nishimura
> *Cc:* pgsql-general@lists.postgresql.org; pgsql-general
> *Subject:* Re: postgresql 9.4 restart
> On 5/8/19 2:01 PM, Julie Nishimura wrote:
>> Adrian, thanks for your reply!
>> 
>> 
>> 
>> PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1), 
>> compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
>> 
>> This is what Ive found in logs prior to the restart:
> 
>> UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07
>> 00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are 
>> reserved for non-replication superuser connections",,,,,,,,,""
>> 2019-05-07 00:08:26.778 
> 
> The above is not good.
> 
> What do you have set for max_connections?
> 
> How many connections are you trying against the database?
> 
> Do you have replication set up to other Postgres instances(uk, au, us)?
> 
> What does the system log show from the same time period?
> 
> 
> 
>> UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07
>> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021 
>> user=hitwise database=hitwise_us_stg host=10.200.193.58 
>> port=49424",,,,,,,,,""
>> 2019-05-07 00:08:26.779 
>> UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07
>> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018 
>> user=hitwise database=hitwise_us_stg host=10.200.193.58 
>> port=49426",,,,,,,,,""
>> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17 
>> UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
>> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17
>> UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07
>> 00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07
>> 00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07
>> 00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07
>> 00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07
>> 00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07
>> 00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to 
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07
>> 00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339 
>> user=hitwise database=hitwise_uk_stg host=10.200.193.58 
>> port=49246",,,,,,,,,""
>> 2019-05-07 00:08:28.060 
>> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07
>> 00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215 
>> user=hitwise database=hitwise_au_stg host=10.200.250.49 
>> port=31100",,,,,,,,,""
>> 
>> Does it help?
>> 
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
>> *Sent:* Wednesday, May 8, 2019 12:07 PM
>> *To:* Julie Nishimura
>> *Cc:* pgsql-general@lists.postgresql.org; pgsql-general
>> *Subject:* Re: postgresql 9.4 restart
>> On 5/8/19 11:00 AM, Julie Nishimura wrote:
>>> Hello,
>>> Our staging 9.4 postgresql has been running couple of weeks with no 
>>> problem, but yesterday we saw this error from one of our services that 
>>> connects to the staging DBs:
>>> 
>>> FATAL: terminating connection due to
>>> administrator command; nested exception is
>>> org.postgresql.util.PSQLException: FATAL: terminating connection due to
>>> administrator command”,
>> 
>> Distro and version?
>> 
>> Well it originated in Java code if that helps?
>> 
>> Anything in the Postgres logs just prior to 'FATAL: terminating
>> connection ...' that might be relevant?
>> 
>> Do the system logs showing anything relevant e.g. the OOM killer
>> stopping Postgres?
>> 
>> 
>> 
>>> 
>>> Then I could see that postgresql got restarted, based on the output of 
>>> "select pg_postmaster_start_time()"
>>> 
>>> We are running postgresql on
>>> Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 
>>> x86_64 x86_64 GNU/Linux
>>> 
>>> How I can find out what cause the restart? Any ideas are welcome. Thanks!
>>> 
>>> -Julie N
>>> 
>>> 
>> 
>> 
>> -- 
>> Adrian Klaver
>> adrian.klaver@aklaver.com
> 
> 
> -- 
> Adrian Klaver
> adrian.klaver@aklaver.com


-- 
Adrian Klaver
adrian.klaver@aklaver.com



pgsql-general by date:

Previous
From: David Steele
Date:
Subject: Re: perl path issue
Next
From: Prakash Ramakrishnan
Date:
Subject: Re: perl path issue