Re: postgresql 9.4 restart - Mailing list pgsql-general

From Julie Nishimura
Subject Re: postgresql 9.4 restart
Date
Msg-id DM6PR08MB5018792362D5EB94328CF167AC320@DM6PR08MB5018.namprd08.prod.outlook.com
Whole thread Raw
In response to Re: postgresql 9.4 restart  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses Re: postgresql 9.4 restart
List pgsql-general
Thanks again for your help.

1) max_connections is set to 500 in our config file
2) our average conn load is 300-400, however, I am not sure if it was close to max_conn at the moment of restart
3) we do not have replication on this staging server
4) These are the snippets from auth logs with the corresponding time:

/var/message/syslog:

May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10290]: (postgres) CMD (/usr/bin/find /data/postgresql/9.4/log -type f ! -name '*.bz2' -mmin +480 -exec bzip2 {} \;)
May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10323]: (postgres) CMD (/usr/bin/find /data/postgresql/9.4/log -type f -empty -mmin +60 -delete)
May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10379]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10655]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10705]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)
May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10902]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11401]: (root) LIST (root)
May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11403]: (root) LIST (postgres)
May  7 00:28:24 hitw-esg-portal-stage-db-1 puppet-agent[10914]: (/Stage[main]/Ci::Db/Package[postgresql-plpython-]/ensure) ensure changed 'purged' to 'present'
May  7 00:28:40 hitw-esg-portal-stage-db-1 puppet-agent[10914]: Finished catalog run in 106.68 seconds
May  7 00:35:01 hitw-esg-portal-stage-db-1 CRON[14398]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)

this is from auth.log:

May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: pam_unix(cron:session): session opened for user postgres by (uid=0)
May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: pam_unix(cron:session): session closed for user postgres
May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: pam_unix(cron:session): session opened for user postgres by (uid=0)
May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: pam_unix(cron:session): session closed for user postgres
May  7 00:02:29 hitw-esg-portal-stage-db-1 sshd[10306]: Connection closed by ::1 port 49936 [preauth]
May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: pam_unix(cron:session): session opened for user postgres by (uid=0)
May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: pam_unix(cron:session): session closed for user postgres
May  7 00:04:30 hitw-esg-portal-stage-db-1 sshd[10365]: Connection closed by ::1 port 49942 [preauth]
May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: pam_unix(cron:session): session opened for user root by (uid=0)
May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: pam_unix(cron:session): session closed for user root
May  7 00:06:31 hitw-esg-portal-stage-db-1 sshd[10422]: Connection closed by ::1 port 49948 [preauth]
May  7 00:08:33 hitw-esg-portal-stage-db-1 sshd[10483]: Connection closed by ::1 port 49954 [preauth]
May  7 00:10:34 hitw-esg-portal-stage-db-1 sshd[10540]: Connection closed by ::1 port 49960 [preauth]
May  7 00:12:35 hitw-esg-portal-stage-db-1 sshd[10594]: Connection closed by ::1 port 49966 [preauth]
May  7 00:14:36 hitw-esg-portal-stage-db-1 sshd[10642]: Connection closed by ::1 port 49972 [preauth]
May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: pam_unix(cron:session): session opened for user root by (uid=0)
May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: pam_unix(cron:session): session closed for user root
May  7 00:16:37 hitw-esg-portal-stage-db-1 sshd[10691]: Connection closed by ::1 port 49978 [preauth]
May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: pam_unix(cron:session): session opened for user root by (uid=0)
May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: pam_unix(cron:session): session closed for user root
May  7 00:18:39 hitw-esg-portal-stage-db-1 sshd[10741]: Connection closed by ::1 port 49984 [preauth]
May  7 00:20:40 hitw-esg-portal-stage-db-1 sshd[10787]: Connection closed by ::1 port 49990 [preauth]
May  7 00:22:41 hitw-esg-portal-stage-db-1 sshd[10833]: Connection closed by ::1 port 49996 [preauth]
May  7 00:24:42 hitw-esg-portal-stage-db-1 sshd[10890]: Connection closed by ::1 port 50002 [preauth]
May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: pam_unix(cron:session): session opened for user root by (uid=0)
May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: pam_unix(cron:session): session closed for user root
May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P format=unaligned -c SELECT datname, pg_encoding_to_char(encoding) FROM pg_database WHERE datistemplate = false AND datname != 'postgres'
May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session opened for user postgres by (uid=0)
May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session closed for user postgres
May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P format=unaligned -c SHOW server_version
May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session opened for user postgres by (uid=0)
May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session closed for user postgres
May  7 00:26:44 hitw-esg-portal-stage-db-1 sshd[11327]: Connection closed by ::1 port 50018 [preauth]


For me, it seems like there is some cron job ran shortly after midnight (either from root or through puppet and restarted the server, because maybe it was close to max connections - just a guess). What do you think?

-Julie N



From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Wednesday, May 8, 2019 2:17 PM
To: Julie Nishimura
Cc: pgsql-general@lists.postgresql.org; pgsql-general
Subject: Re: postgresql 9.4 restart
 
On 5/8/19 2:01 PM, Julie Nishimura wrote:
> Adrian, thanks for your reply!
>
>
>
> PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1),
> compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
>
> This is what Ive found in logs prior to the restart:

> UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07
> 00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are
> reserved for non-replication superuser connections",,,,,,,,,""
> 2019-05-07 00:08:26.778

The above is not good.

What do you have set for max_connections?

How many connections are you trying against the database?

Do you have replication set up to other Postgres instances(uk, au, us)?

What does the system log show from the same time period?



> UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07
> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021
> user=hitwise database=hitwise_us_stg host=10.200.193.58
> port=49424",,,,,,,,,""
> 2019-05-07 00:08:26.779
> UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07
> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018
> user=hitwise database=hitwise_us_stg host=10.200.193.58
> port=49426",,,,,,,,,""
> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17
> UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17
> UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07
> 00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07
> 00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07
> 00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07
> 00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07
> 00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07
> 00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to
> administrator command",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07
> 00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339
> user=hitwise database=hitwise_uk_stg host=10.200.193.58
> port=49246",,,,,,,,,""
> 2019-05-07 00:08:28.060
> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07
> 00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215
> user=hitwise database=hitwise_au_stg host=10.200.250.49
> port=31100",,,,,,,,,""
>
> Does it help?
>
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@aklaver.com>
> *Sent:* Wednesday, May 8, 2019 12:07 PM
> *To:* Julie Nishimura
> *Cc:* pgsql-general@lists.postgresql.org; pgsql-general
> *Subject:* Re: postgresql 9.4 restart
> On 5/8/19 11:00 AM, Julie Nishimura wrote:
>> Hello,
>> Our staging 9.4 postgresql has been running couple of weeks with no
>> problem, but yesterday we saw this error from one of our services that
>> connects to the staging DBs:
>>
>> FATAL: terminating connection due to
>> administrator command; nested exception is
>> org.postgresql.util.PSQLException: FATAL: terminating connection due to
>> administrator command”,
>
> Distro and version?
>
> Well it originated in Java code if that helps?
>
> Anything in the Postgres logs just prior to 'FATAL: terminating
> connection ...' that might be relevant?
>
> Do the system logs showing anything relevant e.g. the OOM killer
> stopping Postgres?
>
>
>
>>
>> Then I could see that postgresql got restarted, based on the output of
>> "select pg_postmaster_start_time()"
>>
>> We are running postgresql on
>> Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64
>> x86_64 x86_64 GNU/Linux
>>
>> How I can find out what cause the restart? Any ideas are welcome. Thanks!
>>
>> -Julie N
>>
>>
>
>
> --
> Adrian Klaver
> adrian.klaver@aklaver.com


--
Adrian Klaver
adrian.klaver@aklaver.com

pgsql-general by date:

Previous
From: Ron
Date:
Subject: Re: Oracle Migration Approach (Open source vs Vendor Specific)
Next
From: Igal Sapir
Date:
Subject: Re: Postgres for SQL Server users