Thread: RHEL 7 (systemd) reboot
I am running three instances (under different users) on a RHEL 7 server to support a vendor product.
In the defined services, the start & stop scripts work fine when invoked with systemctl {start|stop} whatever.service but we have automated monthly patching which does a reboot.
Looking in /var/log/messages, the stop scripts do not get invoked on reboot, so I created a new shutdown service as described here <https://unix.stackexchange.com/questions/211924/effect-of-reboot-signal-on-systemd-service-state>.
It appears that PostgreSQL is receiving a signal from somewhere prior to my script running…
Oct 05 14:18:56 kccontrolmt01 NetworkManager[787]: <info> [1538767136.0967] manager: NetworkManager state is now DISCONNECTED
Oct 05 14:18:56 kccontrolmt01 dbus[740]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispa
Oct 05 14:18:56 kccontrolmt01 dbus[740]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.nm-dispatcher.service': Refusing activation
Oct 05 14:18:56 kccontrolmt01 network[29310]: Shutting down interface eth0: Device 'eth0' successfully disconnected.
Oct 05 14:18:56 kccontrolmt01 network[29310]: [ OK ]
Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: ------------------------
Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Shutting down CONTROL-M.
Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: ------------------------
Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Waiting ...
Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: psql action failed. cannot perform sql command in /data00/ctmlinux/ctm_server/tmp/upd_CMS_SY
Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: db_execute_sql failed while processing /data00/ctmlinux/ctm_server/tmp/upd_CMS_SYSPRM_29448.
Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Failed to update CMS_SYSPRM table.
Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Be aware that the Configuration Agent might start the CONTROL-M/Server
The database must be available for the product to shut down in a consistent state.
I am open to suggestions.
Thanks,
Bryce
Bryce Pepper
Sr. Unix Applications Systems Engineer
The Kansas City Southern Railway Company
114 West 11th Street | Kansas City, MO 64105
Office: 816.983.1512
Email: bpepper@kcsouthern.com
On 10/9/18 11:06 AM, Bryce Pepper wrote:
> Looking in /var/log/messages and the stop scripts do not get invoked on
> reboot, therefore I created a new shutdown service as described here
> <https://unix.stackexchange.com/questions/211924/effect-of-reboot-signal-on-systemd-service-state>.
>
> It appears that PostGreSQL is receiving a signal from somewhere prior to
> my script running…
>
> The database must be available for the product to shut down in a
> consistent state.
>
> I am open to suggestions.

What is the below doing or coming from?:

db_execute_sql failed while processing /data00/ctmlinux/ctm_server/tmp/upd_CMS_SYSPRM_29448.

--
Adrian Klaver
adrian.klaver@aklaver.com
Adrian,

Thanks for the inquiry. The function (db_execute_sql) is coming from a vendor (BMC) product called Control-M. It is a scheduling product.

The tmp file is deleted before I can see its contents but I believe it is trying to update some columns in the CMS_SYSPRM table.

I also think the postgresql instance is already stopped and hence why the db_execute fails. I will try to modify the vendor function to save off the contents of the query.

Bryce

p.s. Do you know of any verbose logging that could be turned on to catch when pgsql is being terminated?
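On the logging question in the p.s.: the PostgreSQL server already writes a line to its own log when it is asked to stop ("received smart/fast/immediate shutdown request") and another when it has finished ("database system is shut down"). A minimal sketch, not taken from the thread, that pulls those lines out of a server log follows. The function name and the log path are assumptions; the Control-M instances may keep their logs somewhere else entirely.

```shell
#!/bin/bash
# Sketch: list the lines in a PostgreSQL server log that record a shutdown.
# The log file location is an assumption; pass the real file for each instance.

# find_pg_shutdowns LOGFILE
# Prints the "received ... shutdown request" and "database system is shut down"
# lines that the server writes when it stops.
find_pg_shutdowns() {
    local logfile=$1
    grep -E 'received (smart|fast|immediate) shutdown request|database system is shut down' "$logfile"
}
```

Usage would be something like `find_pg_shutdowns /path/to/instance/postgresql.log`. The lines only carry a timestamp if `log_line_prefix` includes one (for example `%t` or `%m`), which is what makes it possible to compare them against the /var/log/messages entries.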
Here are the contents of the query and the error:

[root@kccontrolmt01 tmp]# cat ctm.Xf9pQkg2
update CMS_SYSPRM set CURRENT_STATE='STOPPING',DESIRED_STATE='Down' where DESIRED_STATE <> 'Ignored' ;
psql: could not connect to server: Connection refused
        Is the server running on host "kccontrolmt01" (10.1.32.53) and accepting
        TCP/IP connections on port 5433?
On 10/10/18 5:32 AM, Bryce Pepper wrote:
> I also think the postgresql instance is already stopped and hence why the db_execute fails. I will try to modify the vendor function to save off the contents of the query.

Alright, I'm confused. In your earlier post you said the stop script is not running. Yet here it is, just not at the right time. I think a more detailed explanation is needed:

1) The stop script you are concerned about is a systemd script, one that you created or system provided?

2) What is the shutdown service you refer to?

3) Is there a separate shutdown script for the Control-M product?

4) What do you expect to happen vs what is happening?

--
Adrian Klaver
adrian.klaver@aklaver.com
Sorry, I wasn't clear in the prior posts.

The stop script is running during reboot. The problem is the database is not reachable when the stop script runs. The ctmdist server shut down is as follows:
    Stop control-m application
    Stop control-m configuration agent
    Stop database

As you can see the intent is for the database to be shut down after the product.

But as you noticed from /var/log/messages the stop_ctmlinux_server.sh script is running but unable to execute the update query.

I created the following Service definition and scripts that follow -- note there are 2 datacenters (ctmdist, ctmlinux) that have comparable scripts so I have only included one set:

[root@kccontrolmt01 ~]# cat ControlM_Shutdown.service
[Unit]
Description=Run mycommand at shutdown
Requires=network.target CTM_Postgre.service
DefaultDependencies=no
Before=shutdown.target reboot.target

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/root/scripts/control-m_shutdown.sh

[Install]
WantedBy=multi-user.target

[root@kccontrolmt01 ~]# cat /root/scripts/control-m_shutdown.sh
#!/bin/sh
# Shutdown any running Control-M services
STATUS=$(/usr/bin/systemctl is-active CTMLinux_Server.service)
if [ ${STATUS} == "active" ]; then
    /usr/bin/systemctl stop CTMLinux_Server.service
fi

STATUS=$(/usr/bin/systemctl is-active CTMDist_Server.service)
if [ ${STATUS} == "active" ]; then
    /usr/bin/systemctl stop CTMDist_Server.service
fi

STATUS=$(/usr/bin/systemctl is-active EnterpriseManager.service)
if [ ${STATUS} == "active" ]; then
    /usr/bin/systemctl stop EnterpriseManager.service
fi
exit 0


#!/bin/bash

# stop CONTROL-M
if [ -f /data00/ctmlinux/ctm_server/scripts/shut_ctm ]; then
    echo "Stopping CONTROL-M application"
    /data00/ctmlinux/ctm_server/scripts/shut_ctm
fi

# stop CONTROL-M Configuration Agent
if [ -f /data00/ctmlinux/ctm_server/scripts/shut_ca ]; then
    echo "Stopping CONTROL-M Server Configuration Agent"
    /data00/ctmlinux/ctm_server/scripts/shut_ca
fi

# stop database
/data00/ctmlinux/ctm_server/scripts/dbversion
if [ $? -ne 0 ] ; then
    echo "SQL Server is already stopped "
else
    if [ -f /data00/ctmlinux/ctm_server/scripts/shutdb ]; then
        echo "Stopping SQL server for CONTROL-M"
        /data00/ctmlinux/ctm_server/scripts/shutdb
    fi
fi

exit 0
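The three near-identical stanzas in control-m_shutdown.sh can be folded into one loop. The sketch below is an illustration rather than the script that was actually run: the function name `stop_if_active` and the `SYSTEMCTL` override variable are made up here, while `systemctl is-active --quiet` is the stock way to test a unit's state by exit status. It also avoids `==` inside `[ ]`, which is a bash extension and only works under `#!/bin/sh` because /bin/sh is bash on RHEL.

```shell
#!/bin/bash
# Sketch: stop each listed unit only if it is currently active.
# SYSTEMCTL can be overridden (hypothetical knob, e.g. for a dry run).
SYSTEMCTL=${SYSTEMCTL:-/usr/bin/systemctl}

# stop_if_active UNIT...
stop_if_active() {
    local unit
    for unit in "$@"; do
        # is-active --quiet prints nothing and exits 0 only when the unit is active
        if "$SYSTEMCTL" is-active --quiet "$unit"; then
            echo "Stopping $unit"
            "$SYSTEMCTL" stop "$unit"
        fi
    done
}
```

The body of the shutdown script would then reduce to `stop_if_active CTMLinux_Server.service CTMDist_Server.service EnterpriseManager.service`. This only tidies the script; it does not change when systemd runs it relative to the database.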
On 10/10/18 7:37 AM, Bryce Pepper wrote:
> Sorry, I wasn't clear in the prior posts.
>
> The stop script is running during reboot. The problem is the database is not reachable when the stop script runs. The ctmdist server shut down is as follows:
> Stop control-m application
> Stop control-m configuration agent
> Stop database

Several things:

1) In your OP there was this:

Oct 05 14:18:56 kccontrolmt01 network[29310]: Shutting down interface eth0: Device 'eth0' successfully disconnected.
Oct 05 14:18:56 kccontrolmt01 network[29310]: [ OK ]
Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: ------------------------
Oct 05 14:18:56 kccontrolmt01 stop_ctmlinux_server.sh[29185]: Shutting down CONTROL-M.

So is your Postgres instance running on the same machine as the CTM instance or does the eth0 need to be up to reach the database?

2) In the above there is:

"Shutting down CONTROL-M."

Yet in the script you posted there is:

"Stopping CONTROL-M application"

Is this because there are sub-scripts involved or the "Stopping ..." is embedded in the script?

3) I am by no means a shell script expert and I will admit to not fully understanding what control-m_shutdown.sh does. Still here it goes:

a) Are there actually two shebangs in one file or are there two files involved?

b) What is:

# stop database
/data00/ctmlinux/ctm_server/scripts/dbversion
if [ $? -ne 0 ] ; then
    echo "SQL Server is already stopped "
else
    if [ -f /data00/ctmlinux/ctm_server/scripts/shutdb ]; then
        echo "Stopping SQL server for CONTROL-M"
        /data00/ctmlinux/ctm_server/scripts/shutdb
    fi

actually doing?

I ask because from what I can see there are a set of parallel processes initiated and it is possible that the database server is winning. It comes down to what 'if [ $? -ne 0 ]' is testing.

--
Adrian Klaver
adrian.klaver@aklaver.com
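On what `if [ $? -ne 0 ]` is testing: in a POSIX shell `$?` holds the exit status of the most recently executed command, so in the quoted fragment it is the exit status of the vendor's dbversion script. A small sketch of the same branch, with a stand-in command in place of dbversion (the function name `report_db_state` and the second message are invented for illustration; what dbversion actually checks is up to BMC):

```shell
#!/bin/bash
# report_db_state CMD [ARGS...]
# Runs CMD, then branches on $? the same way the quoted fragment does.
report_db_state() {
    "$@" >/dev/null 2>&1
    # $? is the exit status of the command that just ran
    if [ $? -ne 0 ]; then
        echo "SQL Server is already stopped"
    else
        echo "database reachable, shutdb would run"
    fi
}
```

Calling `report_db_state true` takes the else branch and `report_db_state false` takes the "already stopped" branch. So the test itself does not stop anything; it only decides whether shutdb is called, based on whether the check command succeeded.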
Adrian,

Thanks for being willing to dig into this.

You are correct there are other scripts being called from mine (delivered by BMC with their software). In order to stay in support and work with their updates I use the vendor supplied scripts/programs.

The Control-M product is installed on this single server and is broken down into the following parts:
    Enterprise server with dedicated postgresql instance
    Distributed datacenter with agent and dedicated postgresql instance
    Linux datacenter with agent and dedicated postgresql instance

To cut down on the noise, my post only focused on the "Distributed" side and shutdown process -- although the ControlM_Shutdown.service unit stop script manages all of the above components.

In the ControlM_Shutdown.service there is a requires statement identifying that network must be available while this systemd unit runs.

You noticed that the eth0 disconnected in the /var/log/messages. I showed that to highlight that the unit was not executing in the order I had intended, again refer to the requires statement.

The second shebang is from one of the invoked subscripts (stop_ctmdist_server.sh) and is the "main" shutdown sequence for the Distributed datacenter (I think the "SQL server" echo from BMC is because it can be configured with other databases and they use it in a generic term --- not meaning sqlserver from Microsoft).

The dbversion check is being used to verify pgsql instance for this datacenter is running and returns a non-zero return code if the instance is unreachable (I could use pg_isready or pg_ctl but would diverge further from the BMC supported technique).

You probably also noticed in the earlier posted shutdown service a requires of CTM_Postgre.service. This was one of my attempts to ensure the instance was available by actually starting the instance outside of the BMC routines (if it is already running the BMC routines will not start -- the dbversion check is on the start side also). I thought if I managed the postgresql instance outside of the product I could ensure it was running. Unfortunately that didn't work as the instance shutdown on its own, presumably a resource (perhaps network) was terminated and postgresql shutdown.

So to restate the original post... It appears the postgresql instance is unavailable when the stop script runs.

Thanks,
Bryce

[root@kccontrolmt01 ~]# systemctl --full cat ControlM_Shutdown.service
# /etc/systemd/system/ControlM_Shutdown.service
[Unit]
Description=Run ControlM shutdown process
Requires=graphical.target multi-user.target network.target network.service sockets.target
DefaultDependencies=no
Before=shutdown.target reboot.target halt.target poweroff.target kexec.target

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/bin/true
ExecStop=/bin/bash /root/scripts/control-m_shutdown.sh
TimeoutStopSec=4min

[Install]
WantedBy=multi-user.target
[root@kccontrolmt01 ~]#
On 10/11/18 6:33 AM, Bryce Pepper wrote:
> [root@kccontrolmt01 ~]# systemctl --full cat ControlM_Shutdown.service
> # /etc/systemd/system/ControlM_Shutdown.service
> [Unit]
> Description=Run ControlM shutdown process
> Requires=graphical.target multi-user.target network.target network.service sockets.target
> DefaultDependencies=no
> Before=shutdown.target reboot.target halt.target poweroff.target kexec.target

Again I am not a systemd expert, but I believe the Before line above is the opposite of what you want:

https://serverfault.com/questions/812584/in-systemd-whats-the-difference-between-after-and-requires#812589

Above quotes man page (https://www.freedesktop.org/software/systemd/man/systemd.unit.html):

"... Note that when two units with an ordering dependency between them are shut down, the inverse of the start-up order is applied. i.e. if a unit is configured with After= on another unit, the former is stopped before the latter if both are shut down. ..."

--
Adrian Klaver
adrian.klaver@aklaver.com
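To make the man page passage concrete: Requires= only pulls units in and does not order them, while After= does, and stop order is the reverse of start order. A unit that declares After=X therefore has its ExecStop run before X is stopped. A hedged sketch of adding that ordering through a drop-in file follows. The function name and the file name `ordering.conf` are hypothetical, CTM_Postgre.service is the unit name mentioned earlier in the thread, and whether this alone fixes the reboot behaviour described here is untested.

```shell
#!/bin/bash
# write_ordering_dropin UNIT_DIR
# Writes a drop-in that orders ControlM_Shutdown.service after the units it
# needs at stop time. UNIT_DIR would normally be /etc/systemd/system.
write_ordering_dropin() {
    local dropin_dir="$1/ControlM_Shutdown.service.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/ordering.conf" <<'EOF'
[Unit]
# Requires= pulls units in but does not order them; After= does.
# Units are stopped in the reverse of their start order, so this unit's
# ExecStop runs before the units listed here are stopped.
After=network.target CTM_Postgre.service
EOF
}
```

On a real system this would be `write_ordering_dropin /etc/systemd/system` followed by `systemctl daemon-reload`. Writing to a drop-in rather than editing the unit keeps the original file intact, and After= entries from drop-ins are added to those already in the unit.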
Adrian, I tried changing the Before to After but the postgresql instance was still shutdown too early. I appreciate all of the help but think I'm going to ask the patching group to ensure they stop the control-m services priorto reboot. Bryce Oct 11 09:19:57 kccontrolmt01 su[9816]: pam_unix(su-l:session): session opened for user sa_ctmlinux_uat by (uid=0) Oct 11 09:19:57 kccontrolmt01 systemd[1]: Started Restore /run/initramfs. Oct 11 09:19:57 kccontrolmt01 stop_ctmdist_agent.sh[9671]: setenv: Too many arguments. Oct 11 09:19:57 kccontrolmt01 stop_ctmlinux_agent.sh[9672]: setenv: Too many arguments. Oct 11 09:19:57 kccontrolmt01 stop_ctmdist_agent.sh[9671]: Killing Control-M/Agent Listener pid:5595 Oct 11 09:19:57 kccontrolmt01 stop_ctmlinux_agent.sh[9672]: Killing Control-M/Agent Listener pid:5977 Oct 11 09:19:58 kccontrolmt01 stop_ctmdist_agent.sh[9671]: 2018-10-11 09:19:58 Listener process stopped Oct 11 09:19:58 kccontrolmt01 stop_ctmlinux_agent.sh[9672]: 2018-10-11 09:19:58 Listener process stopped Oct 11 09:19:58 kccontrolmt01 stop_ctmlinux_agent.sh[9672]: Killing Control-M/Agent Tracker pid:6199 Oct 11 09:19:58 kccontrolmt01 stop_ctmdist_agent.sh[9671]: Killing Control-M/Agent Tracker pid:6172 Oct 11 09:19:58 kccontrolmt01 systemd[1]: Stopped Dynamic System Tuning Daemon. Oct 11 09:19:59 kccontrolmt01 stop_ctmlinux_agent.sh[9672]: 2018-10-11 09:19:59 Tracker process stopped Oct 11 09:19:59 kccontrolmt01 stop_ctmdist_agent.sh[9671]: 2018-10-11 09:19:59 Tracker process stopped Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopped Eracent EUA Service. Oct 11 09:19:59 kccontrolmt01 su[9815]: pam_unix(su-l:session): session closed for user sa_ctmdist_uat Oct 11 09:19:59 kccontrolmt01 su[9816]: pam_unix(su-l:session): session closed for user sa_ctmlinux_uat Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopped Control-M CTM Dist Agent. Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopping Control-M CTM Dist Server... 
Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopped Control-M CTM Linux Agent. Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopping Control-M CTM Linux Server... Oct 11 09:19:59 kccontrolmt01 su[10319]: (to sa_ctmdist_uat) root on none Oct 11 09:19:59 kccontrolmt01 su[10320]: (to sa_ctmlinux_uat) root on none Oct 11 09:19:59 kccontrolmt01 systemd[1]: Requested transaction contradicts existing jobs: Transaction is destructive. Oct 11 09:19:59 kccontrolmt01 systemd-logind[777]: Failed to start session scope session-c12.scope: Transaction is destructive. Oct 11 09:19:59 kccontrolmt01 su[10319]: pam_systemd(su-l:session): Failed to create session: Resource deadlock avoided Oct 11 09:19:59 kccontrolmt01 su[10319]: pam_unix(su-l:session): session opened for user sa_ctmdist_uat by (uid=0) Oct 11 09:19:59 kccontrolmt01 systemd[1]: Requested transaction contradicts existing jobs: Transaction is destructive. Oct 11 09:19:59 kccontrolmt01 systemd-logind[777]: Failed to start session scope session-c13.scope: Transaction is destructive. Oct 11 09:19:59 kccontrolmt01 su[10320]: pam_systemd(su-l:session): Failed to create session: Resource deadlock avoided Oct 11 09:19:59 kccontrolmt01 su[10320]: pam_unix(su-l:session): session opened for user sa_ctmlinux_uat by (uid=0) Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopped Eracent EPA Service. Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopped target Network. Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopping Network. Oct 11 09:19:59 kccontrolmt01 systemd[1]: Stopping LSB: Bring up/down networking... Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: setenv: Too many arguments. Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Stopping CONTROL-M application Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: SQL Server is not running. Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: setenv: Too many arguments. 
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Stopping CONTROL-M application Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: SQL Server is not running. Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: ------------------------ Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Shutting down CONTROL-M. Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: ------------------------ Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Waiting ... Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: psql action failed. cannot perform sql command in /data00/ctmdist/ctm_server/tmp/upd_CMS_SYSP Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: db_execute_sql failed while processing /data00/ctmdist/ctm_server/tmp/upd_CMS_SYSPRM_10512.sq Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Failed to update CMS_SYSPRM table. Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Be aware that the Configuration Agent might start the CONTROL-M/Server Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: ------------------------ Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Shutting down CONTROL-M. Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: ------------------------ Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Waiting ... Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: psql action failed. cannot perform sql command in /data00/ctmlinux/ctm_server/tmp/upd_CMS_SY Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: db_execute_sql failed while processing /data00/ctmlinux/ctm_server/tmp/upd_CMS_SYSPRM_10571. Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Failed to update CMS_SYSPRM table. 
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Be aware that the Configuration Agent might start the CONTROL-M/Server
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <info> [1539267600.3979] device (eth0): state change: activated -> deactivating (reason 'user-requeste
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <info> [1539267600.4062] manager: NetworkManager state is now DISCONNECTING
Oct 11 09:20:00 kccontrolmt01 dbus[748]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispa
Oct 11 09:20:00 kccontrolmt01 dbus[748]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.nm-dispatcher.service': Refusing activation
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <info> [1539267600.4228] audit: op="device-disconnect" interface="eth0" ifindex=2 pid=10883 uid=0 resu
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <info> [1539267600.4240] device (eth0): state change: deactivating -> disconnected (reason 'user-reque
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <warn> [1539267600.4319] platform-linux: do-change-link[2]: failure changing link: failure 97 (Address
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <warn> [1539267600.4325] device (eth0): failed to enable userspace IPv6LL address handling (unspecifie
Oct 11 09:20:00 kccontrolmt01 NetworkManager[789]: <info> [1539267600.4509] manager: NetworkManager state is now DISCONNECTED
Oct 11 09:20:00 kccontrolmt01 dbus[748]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispa
Oct 11 09:20:00 kccontrolmt01 dbus[748]: [system] Activation via systemd failed for unit 'dbus-org.freedesktop.nm-dispatcher.service': Refusing activation
Oct 11 09:20:00 kccontrolmt01 network[10323]: Shutting down interface eth0: Device 'eth0' successfully disconnected.
On 10/11/18 7:53 AM, Bryce Pepper wrote:
> Adrian,
>
> I tried changing the Before to After but the postgresql instance was still shutdown too early.

In an earlier post you had:

cat ControlM_Shutdown.service
[Unit]
Description=Run mycommand at shutdown
Requires=network.target CTM_Postgre.service

Did you add CTM_Postgre.service to After= ?

My suspicion being that CTM_Postgre.service is running before you get to ControlM_Shutdown.service. Unless of course CTM_Postgre.service does not exist anymore.

Then there is this:

Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: setenv: Too many arguments.
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Stopping CONTROL-M application
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: SQL Server is not running.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: setenv: Too many arguments.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Stopping CONTROL-M application
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: SQL Server is not running.

which to me looks like the script is running twice.

>
> I appreciate all of the help but think I'm going to ask the patching group to ensure they stop the control-m services prior to reboot.

Yeah, there seems to be hidden dependencies happening.

>
> Bryce
>

--
Adrian Klaver
adrian.klaver@aklaver.com
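
The After= question above matters because systemd stops units in the reverse of their start order: a unit ordered After=CTM_Postgre.service is stopped before CTM_Postgre.service is, so its ExecStop runs while Postgres is still up. A minimal sketch of such a unit follows, assuming CTM_Postgre.service still exists and really is the unit that runs this Postgres instance. The function name write_shutdown_unit, the [Service] and [Install] sections, and the ExecStop path are illustrative placeholders, not taken from the thread.

```shell
#!/bin/sh
# Sketch only: writes a unit like the one quoted above, with
# CTM_Postgre.service added to After=. Because stop order is the reverse
# of start order, ExecStop here runs before CTM_Postgre.service is stopped.
# $1: destination path for the unit file.
write_shutdown_unit() {
    cat > "$1" <<'EOF'
[Unit]
Description=Run mycommand at shutdown
Requires=network.target CTM_Postgre.service
After=network.target CTM_Postgre.service

[Service]
# oneshot + RemainAfterExit keeps the unit "active" after boot,
# so systemd has something to stop (and an ExecStop to run) at shutdown.
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
# Placeholder path: substitute the real stop script.
ExecStop=/path/to/stop_ctmlinux_server.sh

[Install]
WantedBy=multi-user.target
EOF
}
```

It might be used as `write_shutdown_unit /etc/systemd/system/ControlM_Shutdown.service` followed by `systemctl daemon-reload`. This only helps if Postgres is actually started by the unit named in After=; if it is started some other way, the ordering has nothing to attach to.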
I disabled and removed the CTM_Postgre.service as it didn't help (and I didn't want too many moving parts left out there).

I did find a post https://superuser.com/questions/1016827/how-do-i-run-a-script-before-everything-else-on-shutdown-with-systemd that I think is getting me closer.

I tried RequiresMountsFor=/data00 which starts the script much sooner but unfortunately the postgresql instance is unreachable by the time the script gets there.

These are two unique datacenter shutdowns: ctmdist & ctmlinux

Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: setenv: Too many arguments.
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Stopping CONTROL-M application
Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: SQL Server is not running.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: setenv: Too many arguments.
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Stopping CONTROL-M application
Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: SQL Server is not running.
On 10/11/18 10:43 AM, Bryce Pepper wrote:
> I disabled and removed the CTM_Postgre.service as it didn't help (and I didn't want too many moving parts left out there).
>
> I did find a post https://superuser.com/questions/1016827/how-do-i-run-a-script-before-everything-else-on-shutdown-with-systemd that I think is getting me closer.
>
> I tried RequiresMountsFor=/data00 which starts the script much sooner but unfortunately the postgresql instance is unreachable by the time the script gets there.

Seems to me the first priority is finding what is shutting down Postgres. Does the system log show anything? If not, find the shutdown time in the Postgres log and correlate that with the system log.

>
> These are two unique datacenter shutdowns: ctmdist & ctmlinux
>
> Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: setenv: Too many arguments.
> Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: Stopping CONTROL-M application
> Oct 11 09:20:00 kccontrolmt01 stop_ctmdist_server.sh[10316]: SQL Server is not running.
> Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: setenv: Too many arguments.
> Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: Stopping CONTROL-M application
> Oct 11 09:20:00 kccontrolmt01 stop_ctmlinux_server.sh[10318]: SQL Server is not running.
>

--
Adrian Klaver
adrian.klaver@aklaver.com
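
The correlation suggested above could be done roughly as follows. This is a sketch under assumptions: the Postgres log path is not given in the thread, so PG_LOG is a placeholder, and the helper names pg_shutdown_requests and syslog_stops_at are made up for illustration. It relies on Postgres logging "received smart/fast/immediate shutdown request" when it is signalled (SIGTERM, SIGINT and SIGQUIT respectively), and on the syslog line format shown earlier in this thread.

```shell
#!/bin/sh
# Placeholder paths: PG_LOG is not known from the thread;
# /var/log/messages is the system log mentioned in the original post.
PG_LOG="${PG_LOG:-/path/to/postgresql.log}"
SYS_LOG="${SYS_LOG:-/var/log/messages}"

# Print the Postgres log lines that record a shutdown request.
# smart = SIGTERM, fast = SIGINT, immediate = SIGQUIT.
# $1: Postgres log file.
pg_shutdown_requests() {
    grep -E 'received (smart|fast|immediate) shutdown request' "$1"
}

# Print what systemd was stopping during a given minute.
# $1: syslog-style time prefix, e.g. "Oct 11 09:19"
# $2: system log file.
syslog_stops_at() {
    grep -F -- "$1" "$2" | grep -E 'systemd\[1\]: (Stopping|Stopped) '
}
```

Run `pg_shutdown_requests "$PG_LOG"` first to get the time of the request, then pass that minute to `syslog_stops_at`, for example `syslog_stops_at "Oct 11 09:19" "$SYS_LOG"`. The kind of shutdown request logged also narrows down which signal was sent.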