Thread: Debugging Failed Startup

Debugging Failed Startup

From
Ray Cote
Date:
Does anyone have a hint on how I'd go about debugging why PostgreSQL 11 is not starting on CentOS 7? 
Was running fine for several weeks then fails to come up after a reboot. 

I did change the name of the CentOS7 server, but I don't expect that to have any effect on the database. 

I'm running debug5 to the logs. 

There's nothing in the ./log directory past the database shutting down.
Here's what's in messages. All seems to be fine and then boom!

... postmaster: LOG:  listening on IPv6 address "::1", port 5432
... postmaster: LOG:  listening on IPv4 address "127.0.0.1", port 5432
... postmaster: LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
... postmaster: LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
... postmaster: DEBUG:  invoking IpcMemoryCreate(size=148553728)
... postmaster: DEBUG:  SlruScanDirectory invoking callback on pg_notify/0000
... postmaster: DEBUG:  removing file "pg_notify/0000"
... postmaster: DEBUG:  dynamic shared memory system will support 288 segments
... postmaster: DEBUG:  created dynamic shared memory control segment 1998722972 (6928 bytes)
... postmaster: DEBUG:  max_safe_fds = 983, usable_fds = 1000, already_open = 7
... postmaster: LOG:  redirecting log output to logging collector process
... postmaster: HINT:  Future log output will appear in directory "log".
... systemd: postgresql-11.service: main process exited, code=exited, status=1/FAILURE
... systemd: Failed to start PostgreSQL 11 database server.
... systemd: Unit postgresql-11.service entered failed state.

--Ray

Re: Debugging Failed Startup

From
Adrian Klaver
Date:
On 5/2/19 11:22 AM, Ray Cote wrote:
> Does anyone have a hint on how I'd go about debugging why PostgreSQL 11 
> is not starting on CentOS 7?
> Was running fine for several weeks then fails to come up after a reboot.
> 
> I did change the name of the CentOS7 server, but I don't expect that to 
> have any effect on the database.
> 
> I'm running debug5 to the logs.

The log below is coming from your syslog?

Is there more in the syslog?

> 
> There's nothing in the ./log directory past the database shutting down.
> Here's what's in messages. All seems to be fine and then boom!
> 
> ... postmaster: LOG:  listening on IPv6 address "::1", port 5432
> ... postmaster: LOG:  listening on IPv4 address "127.0.0.1", port 5432
> ... postmaster: LOG:  listening on Unix socket 
> "/var/run/postgresql/.s.PGSQL.5432"
> ... postmaster: LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
> ... postmaster: DEBUG:  invoking IpcMemoryCreate(size=148553728)
> ... postmaster: DEBUG:  SlruScanDirectory invoking callback on 
> pg_notify/0000
> ... postmaster: DEBUG:  removing file "pg_notify/0000"
> ... postmaster: DEBUG:  dynamic shared memory system will support 288 
> segments
> ... postmaster: DEBUG:  created dynamic shared memory control segment 
> 1998722972 (6928 bytes)
> ... postmaster: DEBUG:  max_safe_fds = 983, usable_fds = 1000, 
> already_open = 7
> ... postmaster: LOG:  redirecting log output to logging collector process
> ... postmaster: HINT:  Future log output will appear in directory "log".
> ... systemd: postgresql-11.service: main process exited, code=exited, 
> status=1/FAILURE
> ... systemd: Failed to start PostgreSQL 11 database server.
> ... systemd: Unit postgresql-11.service entered failed state.
> 
> --Ray


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Debugging Failed Startup

From
Ray Cote
Date:


On Thu, May 2, 2019 at 2:57 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 5/2/19 11:22 AM, Ray Cote wrote:
> Does anyone have a hint on how I'd go about debugging why PostgreSQL 11
> is not starting on CentOS 7?
> Was running fine for several weeks then fails to come up after a reboot.
>
> I did change the name of the CentOS7 server, but I don't expect that to
> have any effect on the database.
>
> I'm running debug5 to the logs.

The log below is coming from your syslog?

Is there more in the syslog?
Yes, that's from the syslog (messages) trimmed a little. There's nothing further in the log. 
Added the debug messages that were prior to that below.

Ended up uninstalling and installing and all's fine again.  Comes back after a reboot. 
>
> There's nothing in the ./log directory past the database shutting down.
> Here's what's in messages. All seems to be fine and then boom!
>
... systemd: Starting PostgreSQL 11 database server...
... postmaster: ... DEBUG:  postmaster: PostmasterMain: initial environment dump:
... postmaster: ... DEBUG:  -----------------------------------------
... postmaster: ... DEBUG:  #011LANG=en_US.UTF-8
... postmaster: ... DEBUG:  #011PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
... postmaster: ... DEBUG:  #011NOTIFY_SOCKET=/run/systemd/notify
... postmaster: ... DEBUG:  #011HOME=/var/lib/pgsql
... postmaster: ... DEBUG:  #011LOGNAME=postgres
... postmaster: ... DEBUG:  #011USER=postgres
... postmaster: ... DEBUG:  #011SHELL=/bin/bash
... postmaster: ... DEBUG:  #011PGDATA=/var/lib/pgsql/11/data/
... postmaster: ... DEBUG:  #011PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
... postmaster: ... DEBUG:  #011PG_OOM_ADJUST_VALUE=0
... postmaster: ... DEBUG:  #011PGLOCALEDIR=/usr/pgsql-11/share/locale
... postmaster: ... DEBUG:  #011PGSYSCONFDIR=/etc/sysconfig/pgsql
... postmaster: ... DEBUG:  #011LC_COLLATE=en_US.UTF-8
... postmaster: ... DEBUG:  #011LC_CTYPE=en_US.UTF-8
... postmaster: ... DEBUG:  #011LC_MESSAGES=en_US.UTF-8
... postmaster: ... DEBUG:  #011LC_MONETARY=C
... postmaster: ... DEBUG:  #011LC_NUMERIC=C
... postmaster: ... DEBUG:  #011LC_TIME=C
... postmaster: ... DEBUG:  -----------------------------------------
... postmaster: ... DEBUG:  registering background worker "logical replication launcher"
> ... postmaster: LOG:  listening on IPv6 address "::1", port 5432
> ... postmaster: LOG:  listening on IPv4 address "127.0.0.1", port 5432
> ... postmaster: LOG:  listening on Unix socket
> "/var/run/postgresql/.s.PGSQL.5432"
> ... postmaster: LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
> ... postmaster: DEBUG:  invoking IpcMemoryCreate(size=148553728)
> ... postmaster: DEBUG:  SlruScanDirectory invoking callback on
> pg_notify/0000
> ... postmaster: DEBUG:  removing file "pg_notify/0000"
> ... postmaster: DEBUG:  dynamic shared memory system will support 288
> segments
> ... postmaster: DEBUG:  created dynamic shared memory control segment
> 1998722972 (6928 bytes)
> ... postmaster: DEBUG:  max_safe_fds = 983, usable_fds = 1000,
> already_open = 7
> ... postmaster: LOG:  redirecting log output to logging collector process
> ... postmaster: HINT:  Future log output will appear in directory "log".
> ... systemd: postgresql-11.service: main process exited, code=exited,
> status=1/FAILURE
> ... systemd: Failed to start PostgreSQL 11 database server.
> ... systemd: Unit postgresql-11.service entered failed state.
>
> --Ray


--
Adrian Klaver
adrian.klaver@aklaver.com


--
Raymond Cote, President
voice: +1.603.924.6079 email: rgacote@AppropriateSolutions.com skype: ray.cote


Re: Debugging Failed Startup

From
Ray Cote
Date:
Spoke too soon, Not coming back after a boot. Seems I can install, run, but not reboot!

On Thu, May 2, 2019 at 3:18 PM Ray Cote <rgacote@appropriatesolutions.com> wrote:


On Thu, May 2, 2019 at 2:57 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 5/2/19 11:22 AM, Ray Cote wrote:
> Does anyone have a hint on how I'd go about debugging why PostgreSQL 11
> is not starting on CentOS 7?
> Was running fine for several weeks then fails to come up after a reboot.
>
> I did change the name of the CentOS7 server, but I don't expect that to
> have any effect on the database.
>
> I'm running debug5 to the logs.

The log below is coming from your syslog?

Is there more in the syslog?
Yes, that's from the syslog (messages) trimmed a little. There's nothing further in the log. 
Added the debug messages that were prior to that below.

Ended up uninstalling and installing and all's fine again.  Comes back after a reboot. 
>
> There's nothing in the ./log directory past the database shutting down.
> Here's what's in messages. All seems to be fine and then boom!
>
... systemd: Starting PostgreSQL 11 database server...
... postmaster: ... DEBUG:  postmaster: PostmasterMain: initial environment dump:
... postmaster: ... DEBUG:  -----------------------------------------
... postmaster: ... DEBUG:  #011LANG=en_US.UTF-8
... postmaster: ... DEBUG:  #011PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
... postmaster: ... DEBUG:  #011NOTIFY_SOCKET=/run/systemd/notify
... postmaster: ... DEBUG:  #011HOME=/var/lib/pgsql
... postmaster: ... DEBUG:  #011LOGNAME=postgres
... postmaster: ... DEBUG:  #011USER=postgres
... postmaster: ... DEBUG:  #011SHELL=/bin/bash
... postmaster: ... DEBUG:  #011PGDATA=/var/lib/pgsql/11/data/
... postmaster: ... DEBUG:  #011PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
... postmaster: ... DEBUG:  #011PG_OOM_ADJUST_VALUE=0
... postmaster: ... DEBUG:  #011PGLOCALEDIR=/usr/pgsql-11/share/locale
... postmaster: ... DEBUG:  #011PGSYSCONFDIR=/etc/sysconfig/pgsql
... postmaster: ... DEBUG:  #011LC_COLLATE=en_US.UTF-8
... postmaster: ... DEBUG:  #011LC_CTYPE=en_US.UTF-8
... postmaster: ... DEBUG:  #011LC_MESSAGES=en_US.UTF-8
... postmaster: ... DEBUG:  #011LC_MONETARY=C
... postmaster: ... DEBUG:  #011LC_NUMERIC=C
... postmaster: ... DEBUG:  #011LC_TIME=C
... postmaster: ... DEBUG:  -----------------------------------------
... postmaster: ... DEBUG:  registering background worker "logical replication launcher"
> ... postmaster: LOG:  listening on IPv6 address "::1", port 5432
> ... postmaster: LOG:  listening on IPv4 address "127.0.0.1", port 5432
> ... postmaster: LOG:  listening on Unix socket
> "/var/run/postgresql/.s.PGSQL.5432"
> ... postmaster: LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
> ... postmaster: DEBUG:  invoking IpcMemoryCreate(size=148553728)
> ... postmaster: DEBUG:  SlruScanDirectory invoking callback on
> pg_notify/0000
> ... postmaster: DEBUG:  removing file "pg_notify/0000"
> ... postmaster: DEBUG:  dynamic shared memory system will support 288
> segments
> ... postmaster: DEBUG:  created dynamic shared memory control segment
> 1998722972 (6928 bytes)
> ... postmaster: DEBUG:  max_safe_fds = 983, usable_fds = 1000,
> already_open = 7
> ... postmaster: LOG:  redirecting log output to logging collector process
> ... postmaster: HINT:  Future log output will appear in directory "log".
> ... systemd: postgresql-11.service: main process exited, code=exited,
> status=1/FAILURE
> ... systemd: Failed to start PostgreSQL 11 database server.
> ... systemd: Unit postgresql-11.service entered failed state.
>
> --Ray


--
Adrian Klaver
adrian.klaver@aklaver.com


--
Raymond Cote, President
voice: +1.603.924.6079 email: rgacote@AppropriateSolutions.com skype: ray.cote




--
Raymond Cote, President
voice: +1.603.924.6079 email: rgacote@AppropriateSolutions.com skype: ray.cote


Re: Debugging Failed Startup

From
Adrian Klaver
Date:
On 5/2/19 12:20 PM, Ray Cote wrote:
> Spoke too soon, Not coming back after a boot. Seems I can install, run, 
> but not reboot!

What happens if change the server name back?

Might want to take a look at what is in postgresql-11.service also.

> 
> On Thu, May 2, 2019 at 3:18 PM Ray Cote 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Debugging Failed Startup

From
Alvaro Herrera
Date:
On 2019-May-02, Ray Cote wrote:

> Does anyone have a hint on how I'd go about debugging why PostgreSQL 11 is
> not starting on CentOS 7?
> Was running fine for several weeks then fails to come up after a reboot.

How are you getting it started after the reboot?  If you're not using
systemd facilities (eg. if you're calling pg_ctl on your own script),
systemd may be getting confused about what's the status of the service
and stopping it on its own.

Also, what are the timestamp differences between the final "redirecting
log output" line and the "Failed to start" lines?


-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Debugging Failed Startup

From
Ray Cote
Date:


On Thu, May 2, 2019 at 4:12 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
On 2019-May-02, Ray Cote wrote:

> Does anyone have a hint on how I'd go about debugging why PostgreSQL 11 is
> not starting on CentOS 7?
> Was running fine for several weeks then fails to come up after a reboot.

How are you getting it started after the reboot?  If you're not using
systemd facilities (eg. if you're calling pg_ctl on your own script),
systemd may be getting confused about what's the status of the service
and stopping it on its own.
systemctl start postgresql-11
Plus I've run systemctl enable postgresql-11

 
Also, what are the timestamp differences between the final "redirecting
log output" line and the "Failed to start" lines?
0 second difference. 

Not sure what's happend. Re-installed a second time. Now it has survived pass three reboots and seems happy. Was just odd that there wasn't any error message as to why it failed.

As an aside to the person asking about changing back the hostname. 
I re-installed using the new hostname and it still failed once. 

Thanks for everyone's feedback. 
Seems to be happy again.
--Ray
 


--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Raymond Cote, President
voice: +1.603.924.6079 email: rgacote@AppropriateSolutions.com skype: ray.cote