Re: psql: FATAL: the database system is starting up - Mailing list pgsql-general

From Tom K
Subject Re: psql: FATAL: the database system is starting up
Msg-id CAE3EmBCbbw-hf7uRca3MvVKogshiLUpOod_9XmX4iF2jKJxugw@mail.gmail.com
In response to Re: psql: FATAL: the database system is starting up  (Adrian Klaver <adrian.klaver@aklaver.com>)
List pgsql-general


On Sat, Jun 1, 2019 at 7:12 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 6/1/19 3:56 PM, Tom K wrote:
>
> postgres=# select oid, datname from pg_database ;
>    oid  |  datname
> -------+-----------
>   13806 | postgres
>       1 | template1
>   13805 | template0
> (3 rows)
>

So there are only the system databases available

> -bash-4.2$ cd /data/patroni/
> -bash-4.2$ ls -altri
> total 144
>   69085037 drwxr-xr-x.  3 root     root        20 Oct 23  2018 ..
> 135316997 -rw-------.  1 postgres postgres   206 Oct 29  2018 backup_label.old
> 201708781 drwx------.  2 postgres postgres     6 Oct 29  2018 pg_commit_ts
>    1502746 drwx------.  2 postgres postgres     6 Oct 29  2018 pg_dynshmem
>   68994449 drwx------.  2 postgres postgres     6 Oct 29  2018 pg_twophase
>    1502749 drwx------.  2 postgres postgres     6 Oct 29  2018 pg_snapshots
> 201708785 drwx------.  2 postgres postgres     6 Oct 29  2018 pg_serial
>    1502747 drwx------.  4 postgres postgres    34 Oct 29  2018 pg_multixact
>   67677559 drwx------.  5 postgres postgres    38 Oct 29  2018 base

base/ is the directory you need to look in. I'm guessing it will only
show the OID directories for the three databases above, plus pgsql_tmp/.

For more info on this see:
https://www.postgresql.org/docs/10/storage-file-layout.html
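
As a rough illustration of that check, here is a minimal Python sketch that splits the entries of base/ into directories matching a known pg_database OID and directories that do not. The OID-to-name map is copied from the query output quoted above; the function names classify_base_entries and scan_base are hypothetical helpers invented for this example, not part of PostgreSQL or Patroni.

```python
import os

# OIDs and names as reported by "select oid, datname from pg_database" above.
KNOWN_DATABASES = {13806: "postgres", 1: "template1", 13805: "template0"}


def classify_base_entries(entries, known=KNOWN_DATABASES):
    """Split base/ entry names into catalogued and uncatalogued OID directories.

    Returns (catalogued, uncatalogued): a dict of oid -> datname for entries
    found in `known`, and a sorted list of numeric entries that are not.
    """
    catalogued = {}
    uncatalogued = []
    for name in entries:
        if not name.isdecimal():  # skips pgsql_tmp and any other non-numeric entry
            continue
        oid = int(name)
        if oid in known:
            catalogued[oid] = known[oid]
        else:
            uncatalogued.append(oid)
    return catalogued, sorted(uncatalogued)


def scan_base(data_dir):
    """Hypothetical convenience wrapper: classify the entries of <data_dir>/base."""
    return classify_base_entries(os.listdir(os.path.join(data_dir, "base")))
```

Calling scan_base("/data/patroni") as a user that can read the data directory would report any OID directory that the running cluster's pg_database does not know about. An empty second list would match the guess above.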

The below looks like the RH package installed data directory. It also
looks like either initdb was never run against it or the files were removed.

I thought you said you had copied in data directories from the other
nodes, did I remember correctly?

> -bash-4.2$ cd
> -bash-4.2$ cd 10
> -bash-4.2$ ls
> backups  data
> -bash-4.2$ pwd
> /var/lib/pgsql/10
> -bash-4.2$ cd data/
> -bash-4.2$ ls -altri
> total 0
> 134734937 drwx------. 4 postgres postgres 31 May  8 06:25 ..
>     245519 drwx------. 2 postgres postgres  6 May  8 06:25 .
> -bash-4.2$ cd ..
> -bash-4.2$ pwd
> /var/lib/pgsql/10
> -bash-4.2$ cd ..
> -bash-4.2$


Yep, you remembered correctly.

I copied the files as they were out to a temporary folder under root on each node, but never dug into base/ etc. any further to check things. So here's the state of the base/ folder in each node's backup.

[ PSQL03 ]
[root@psql03 base]# ls -altri
total 40
    42424 drwx------.  2 postgres postgres 8192 Oct 29  2018 1
 67714749 drwx------.  2 postgres postgres 8192 Oct 29  2018 13805
202037206 drwx------.  5 postgres postgres   38 Oct 29  2018 .
134312175 drwx------.  2 postgres postgres 8192 May 22 01:55 13806
    89714 drwxr-xr-x. 20 root     root     4096 May 22 22:43 ..
[root@psql03 base]#




[ PSQL02 ]
[root@psql02 base]# ls -altri
total 412
201426668 drwx------.  2 postgres postgres  8192 Oct 29  2018 1
   743426 drwx------.  2 postgres postgres  8192 Mar 24 03:47 13805
135326327 drwx------.  2 postgres postgres 16384 Mar 24 20:15 40970
   451699 drwx------.  2 postgres postgres 40960 Mar 25 19:47 16395
  1441696 drwx------.  2 postgres postgres  8192 Mar 31 15:09 131137
 68396137 drwx------.  2 postgres postgres  8192 Mar 31 15:09 131138
135671065 drwx------.  2 postgres postgres  8192 Mar 31 15:09 131139
204353100 drwx------.  2 postgres postgres  8192 Mar 31 15:09 131140
135326320 drwx------. 17 postgres postgres  4096 Apr 14 10:08 .
 68574415 drwx------.  2 postgres postgres 12288 Apr 28 06:06 131142
   288896 drwx------.  2 postgres postgres 16384 Apr 28 06:06 131141
203015232 drwx------.  2 postgres postgres  8192 Apr 28 06:06 131136
135326328 drwx------.  2 postgres postgres 40960 May  5 22:09 24586
 67282461 drwx------.  2 postgres postgres  8192 May  5 22:09 13806
 67640961 drwx------.  2 postgres postgres 20480 May  5 22:09 131134
203500274 drwx------.  2 postgres postgres 16384 May  5 22:09 155710
134438257 drwxr-xr-x. 20 root     root      4096 May 22 01:44 ..
[root@psql02 base]# pwd
/root/postgres-patroni-backup/base
[root@psql02 base]#



[ PSQL01 ]
[root@psql01 base]# ls -altri
total 148
134704615 drwx------.  2 postgres postgres  8192 Oct 29  2018 1
201547700 drwx------.  2 postgres postgres  8192 Oct 29  2018 13805
   160398 drwx------.  2 postgres postgres  8192 Feb 24 23:53 13806
 67482137 drwx------.  7 postgres postgres    62 Feb 24 23:54 .
135909671 drwx------.  2 postgres postgres 24576 Feb 24 23:54 24586
134444555 drwx------.  2 postgres postgres 24576 Feb 24 23:54 16395
 67178716 drwxr-xr-x. 20 root     root      4096 May 22 01:53 ..
[root@psql01 base]# pwd
/root/postgresql-patroni-etcd/base
[root@psql01 base]#

Looks like this crash was far more catastrophic than I thought. By the looks of things, psql02 would be my best bet.
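
To make that comparison explicit, here is a small Python sketch that counts the non-system OID directories in each node's backup. The directory names are transcribed from the three listings above, and the system OIDs come from the pg_database output quoted earlier. The names user_database_oids and rank_nodes are hypothetical. This only counts directories; it is an assumption that more directories means more recoverable data, and it says nothing about whether the files inside are consistent.

```python
# template1, template0 and postgres, per the pg_database output quoted earlier.
SYSTEM_OIDS = {1, 13805, 13806}

# base/ directory names transcribed from the three listings above.
NODE_BASE_DIRS = {
    "psql01": [1, 13805, 13806, 24586, 16395],
    "psql02": [1, 13805, 40970, 16395, 131137, 131138, 131139, 131140,
               131142, 131141, 131136, 24586, 13806, 131134, 155710],
    "psql03": [1, 13805, 13806],
}


def user_database_oids(oids, system_oids=SYSTEM_OIDS):
    """Return the sorted OIDs that are not one of the system databases."""
    return sorted(set(oids) - set(system_oids))


def rank_nodes(node_dirs):
    """Return (node, user database count) pairs, highest count first."""
    counts = {node: len(user_database_oids(oids)) for node, oids in node_dirs.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


for node, count in rank_nodes(NODE_BASE_DIRS):
    print(f"{node}: {count} user database directories")
```

By this count psql02 holds 12 user database directories, psql01 holds 2, and psql03 holds none.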



--
Adrian Klaver
adrian.klaver@aklaver.com
