Thread: Postgre Fails To Start in Multi-User Mode

From: Peter Brady
Hi All,

I'm relatively new to postgres after inheriting this server from a previous admin so please bear with me if these are obvious questions.

My scenario:
  • Last weekend I had a scheduled maintenance window for power/air conditioning work.
  • Prior to this outage the server was running fine.
  • All servers were shut down cleanly prior to the outage, as far as I can tell.
  • My postgres service failed to come back after the outage.
  • I have no logs to indicate an error.
  • I'm running:
    • postgresql-server 9.2.13
    • CentOS 6.6
  • I have backups and a development machine that I can roll to in the short term but I'd really like to get this sorted out.

What I've tried so far:

  1. Checking disk space on the data partition
  2. Starting in single-user mode in the foreground
  3. Starting in multi-user mode in the foreground

Details of these are outlined below.

Disk Space
Google seems to suggest that the main suspect in this area is disk space.  df suggests that this is not the case:

df -hl /var/lib/pgsql/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdc1       917G  519G  353G  60% /var/lib/pgsql

There are, however, some quite large tables and indices so I can't definitively rule this out.

Single User Mode
I can start the server in single-user mode:

-bash-4.1$ /usr/pgsql-9.2/bin/postmaster --single -p 5432 -D /var/lib/pgsql/9.2/data -d 5
DEBUG:  invoking IpcMemoryCreate(size=6612303872)
DEBUG:  SlruScanDirectory invoking callback on pg_notify/0000
DEBUG:  removing file "pg_notify/0000"
DEBUG:  InitPostgres
DEBUG:  my backend ID is 1
LOG:  database system was shut down at 2015-12-14 10:35:37 AEDT
DEBUG:  checkpoint record is at 3ED/7EC8CA68
DEBUG:  redo record is at 3ED/7EC8CA68; shutdown TRUE
DEBUG:  next transaction ID: 0/343785580; next OID: 136370
DEBUG:  next MultiXactId: 441782; next MultiXactOffset: 956443
DEBUG:  oldest unfrozen transaction ID: 150016627, in database 12870
DEBUG:  transaction ID wrap limit is 2297500274, limited by database with OID 12870
DEBUG:  StartTransaction
DEBUG:  name: unnamed; blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
DEBUG:  CommitTransaction
DEBUG:  name: unnamed; blockState:       STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:

PostgreSQL stand-alone backend 9.2.13
backend> [CTRL-D] DEBUG:  shmem_exit(0): 11 callbacks to make
LOG:  shutting down
DEBUG:  SlruScanDirectory invoking callback on pg_multixact/offsets/0006
DEBUG:  SlruScanDirectory invoking callback on pg_multixact/members/000E
DEBUG:  attempting to remove WAL segments older than log file 00000001000003ED0000007D
DEBUG:  SlruScanDirectory invoking callback on pg_subtrans/147D
LOG:  database system is shut down
DEBUG:  proc_exit(0): 3 callbacks to make
DEBUG:  exit(0)
DEBUG:  shmem_exit(-1): 0 callbacks to make
DEBUG:  proc_exit(-1): 0 callbacks to make

There do not appear to be any errors there.

Multi-User Mode
I cannot start in multi-user mode:

-bash-4.1$ /usr/pgsql-9.2/bin/postmaster -p 5432 -D /var/lib/pgsql/9.2/data -d 5
DEBUG:  postmaster: PostmasterMain: initial environment dump:
DEBUG:  -----------------------------------------
DEBUG:      HOSTNAME=syd-pgsql-00.wmawater.com.au
DEBUG:      SHELL=/bin/bash
DEBUG:      TERM=xterm-256color
DEBUG:      HISTSIZE=1000
DEBUG:      QTDIR=/usr/lib64/qt-3.3
DEBUG:      QTINC=/usr/lib64/qt-3.3/include
DEBUG:      USER=postgres
<snip DEBUG:      LS_COLORS>
DEBUG:      MAIL=/var/spool/mail/postgres
DEBUG:      PATH=/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
DEBUG:      PWD=/var/lib/pgsql
DEBUG:      LANG=en_US.UTF-8
DEBUG:      HISTCONTROL=ignoredups
DEBUG:      SHLVL=1
DEBUG:      HOME=/var/lib/pgsql
DEBUG:      LOGNAME=postgres
DEBUG:      QTLIB=/usr/lib64/qt-3.3/lib
DEBUG:      PGDATA=/var/lib/pgsql/9.2/data
DEBUG:      LESSOPEN=||/usr/bin/lesspipe.sh %s
DEBUG:      G_BROKEN_FILENAMES=1
DEBUG:      _=/usr/pgsql-9.2/bin/postmaster
DEBUG:      PGLOCALEDIR=/usr/pgsql-9.2/share/locale
DEBUG:      PGSYSCONFDIR=/etc/sysconfig/pgsql
DEBUG:      LC_COLLATE=en_US.UTF-8
DEBUG:      LC_CTYPE=en_US.UTF-8
DEBUG:      LC_MESSAGES=en_US.UTF-8
DEBUG:      LC_MONETARY=C
DEBUG:      LC_NUMERIC=C
DEBUG:      LC_TIME=C
DEBUG:  -----------------------------------------
DEBUG:  invoking IpcMemoryCreate(size=6612303872)
DEBUG:  SlruScanDirectory invoking callback on pg_notify/0000
DEBUG:  removing file "pg_notify/0000"
DEBUG:  max_safe_fds = 984, usable_fds = 1000, already_open = 6
DEBUG:  logger shutting down
DEBUG:  shmem_exit(0): 0 callbacks to make
DEBUG:  proc_exit(0): 0 callbacks to make
DEBUG:  exit(0)
DEBUG:  shmem_exit(-1): 0 callbacks to make
DEBUG:  proc_exit(-1): 0 callbacks to make

Again there appears to be nothing logged to indicate why the server is not starting at this point.

I would be very grateful for any suggestions at this point.

Cheers,
-pete


Re: Postgre Fails To Start in Multi-User Mode

From: John R Pierce
On 12/13/2015 4:22 PM, Peter Brady wrote:
> Again there appears to be nothing logged to indicate why the server is
> not starting at this point.

the standard versions of postgres for RHEL/CentOS leave two sets of
logs... 1 is the startup log /var/lib/pgsql/9.2/pgstartup.log, and the
other is the regular PG logging, in /var/lib/pgsql/9.2/data/pg_log/* ...

I'd run the /normal/  pg startup script as there may be site-specific
environments configured by the original administrator...

     #  service postgresql-9.2 start

and then look at the startup log first.   if it ends with...

2015-07-21 00:33:31.851 PDT @[]: LOG:  redirecting log output to logging collector process
2015-07-21 00:33:31.851 PDT @[]: HINT:  Future log output will appear in directory "pg_log"

then look in the data/pg_log directory for a new file dated today.
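
A minimal shell sketch of that two-log check, assuming the stock 9.2 paths named above. The function names `newest_file` and `show_pg_logs` are made up for illustration and are not part of any PostgreSQL package:

```shell
#!/bin/sh
# Print the most recently modified entry in a directory (nothing if empty).
newest_file() {
    dir=$1
    # ls -t sorts newest first; head keeps only the first name.
    newest=$(ls -t "$dir" 2>/dev/null | head -n 1)
    if [ -n "$newest" ]; then
        printf '%s/%s\n' "$dir" "$newest"
    fi
}

# Show the tail of the startup log, then the tail of the newest pg_log file.
# $1 is the base directory; the default is an assumption taken from this thread.
show_pg_logs() {
    pgbase=${1:-/var/lib/pgsql/9.2}
    echo "== $pgbase/pgstartup.log =="
    tail -n 20 "$pgbase/pgstartup.log" 2>/dev/null || echo "(missing or unreadable)"
    latest=$(newest_file "$pgbase/data/pg_log")
    if [ -n "$latest" ]; then
        echo "== $latest =="
        tail -n 50 "$latest"
    else
        echo "no files found in $pgbase/data/pg_log"
    fi
}
```

Run it as the postgres user (or root) right after the failed start, e.g. `show_pg_logs /var/lib/pgsql/9.2`. If log_directory has been changed in postgresql.conf, the second log will live somewhere else.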



--
john r pierce, recycling bits in santa cruz



Re: [SOLVED] Postgre Fails To Start in Multi-User Mode

From: Peter Brady
On 14/12/2015 11:39 AM, John R Pierce wrote:
> On 12/13/2015 4:22 PM, Peter Brady wrote:
>> Again there appears to be nothing logged to indicate why the server is not starting at this point.
>
> the standard versions of postgres for RHEL/CentOS leave two sets of logs... 1 is the startup log /var/lib/pgsql/9.2/pgstartup.log, and the other is the regular PG logging, in /var/lib/pgsql/9.2/data/pg_log/* ...
>
> I'd run the /normal/  pg startup script as there may be site-specific environments configured by the original administrator...
>
>     #  service postgresql-9.2 start
>
> and then look at the startup log first.   if it ends with...
>
> 2015-07-21 00:33:31.851 PDT @[]: LOG:  redirecting log output to logging collector process
> 2015-07-21 00:33:31.851 PDT @[]: HINT:  Future log output will appear in directory "pg_log"
>
> then look in the data/pg_log directory for a new file dated today.

Hi John,

Thanks for the heads up on that second log.  I was focusing on:

/var/lib/pgsql/9.2/pgstartup.log

as that's what I found in the init.d script.  It was empty, however.

The key was in the second log in:

/var/lib/pgsql/9.2/data/pg_log/*

which pointed to a typo in pg_hba.conf in a host IP specification.  Clearly the typo had not been through a restart yet, and when I restarted the machine the server hit it.  That was then a simple fix, as the IP address was missing its CIDR mask.
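
For anyone hitting the same thing, here is a rough shell sketch that scans a pg_hba.conf for host lines where the address is a bare IPv4 address with neither a /CIDR suffix nor a separate netmask column. The function name `check_hba_masks` is hypothetical, and the check is a heuristic only: it ignores IPv6, host names and special keywords such as samehost.

```shell
#!/bin/sh
# Report host/hostssl/hostnossl lines whose 4th column is a bare IPv4 address
# and whose 5th column is not a dotted netmask. Exits non-zero if any are found.
check_hba_masks() {
    awk '
        $1 ~ /^host(ssl|nossl)?$/ {
            ipv4 = "^[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]+$"
            # No "/" in the address, and the next field is not a netmask either.
            if ($4 ~ ipv4 && $5 !~ ipv4) {
                printf "line %d: address %s has no CIDR mask\n", NR, $4
                bad = 1
            }
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Usage would be something like `check_hba_masks /var/lib/pgsql/9.2/data/pg_hba.conf`. As an illustration with a made-up address, a line such as `host all all 192.168.1.10 md5` would be flagged, while `host all all 192.168.1.10/32 md5` would pass.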

Cheers,
-pete