Re: found xmin from before relfrozenxid on pg_catalog.pg_authid - Mailing list pgsql-general
From | Jeremy Finzel |
---|---|
Subject | Re: found xmin from before relfrozenxid on pg_catalog.pg_authid |
Date | |
Msg-id | CAMa1XUjOhoxrcdtTPyxBU_CUwjii2Bkc1Fdnto4K-oATDAv4CA@mail.gmail.com Whole thread Raw |
In response to | Re: found xmin from before relfrozenxid on pg_catalog.pg_authid (Maxim Boguk <maxim.boguk@gmail.com>) |
List | pgsql-general |
On Tue, May 22, 2018 at 2:42 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:
On Tue, May 22, 2018 at 10:30 PM, Andres Freund <andres@anarazel.de> wrote:Hi,
On 2018-05-22 22:18:15 +0300, Maxim Boguk wrote:
> On Tue, May 22, 2018 at 9:47 PM, Andres Freund <andres@anarazel.de> wrote:
> > > select relfrozenxid from pg_class where relname='pg_authid';
> > > relfrozenxid
> > > --------------
> > > 2863429136
> select txid_current();
> txid_current
> --------------
> 41995913769
So that's an xid of 3341208114, if you leave the epoch out. What'sautovacuum_freeze_max_age set to in that cluster?postgres=# show autovacuum_freeze_max_age;
autovacuum_freeze_max_age
---------------------------
200000000(default value I think)Can you show pg_controldata output, andrelminmxid from that cluster?postgres@db:~$ /usr/lib/postgresql/9.6/bin/pg_controldata -D /var/lib/postgresql/9.6/main
pg_control version number: 960
Catalog version number: 201608131
Database system identifier: 6469368654711450114
Database cluster state: in production
pg_control last modified: Tue 22 May 2018 10:20:14 PM MSK
Latest checkpoint location: CCB5/F9C37950
Prior checkpoint location: CCB0/43F316B0
Latest checkpoint's REDO location: CCB1/6706BD88
Latest checkpoint's REDO WAL file: 000000010000CCB100000067
Latest checkpoint's TimeLineID: 1
Latest checkpoint's PrevTimeLineID: 1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 9:3341161759
Latest checkpoint's NextOID: 190071899
Latest checkpoint's NextMultiXactId: 59416233
Latest checkpoint's NextMultiOffset: 215588532
Latest checkpoint's oldestXID: 2814548646
Latest checkpoint's oldestXID's DB: 16400
Latest checkpoint's oldestActiveXID: 3341161759
Latest checkpoint's oldestMultiXid: 54264778
Latest checkpoint's oldestMulti's DB: 16400
Latest checkpoint's oldestCommitTsXid:2814548646
Latest checkpoint's newestCommitTsXid:3341161758
Time of latest checkpoint: Tue 22 May 2018 10:05:16 PM MSK
Fake LSN counter for unlogged rels: 0/1
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: replica
wal_log_hints setting: on
max_connections setting: 2000
max_worker_processes setting: 8
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: on
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float4 argument passing: by value
Float8 argument passing: by value
Data page checksum version: 0
postgres=# select datname,datfrozenxid,datminmxid from pg_database order by datname;
datname | datfrozenxid | datminmxid
-----------+--------------+------------
** | 2815939794 | 54265194
postgres | 2863429136 | 54280819
template0 | 3148297669 | 59342106
template1 | 2816765546 | 59261794
I might be daft here, but it's surely curious that the relfrozenxid from
the error and pg_catalog are really different (number of digits):
catalog: 2863429136
error: 248712603
> About gdb bt - it's tricky because it is mission critical master db of
> huge project.
> I'll will try promote backup replica and check is issue persist there and
> if yes - we will have our playground for a while, but it will require
> sometime to arrange.
You should be ok to just bt that in the running cluster, but I
definitely understand if you don't want to do that... I'd appreciate if
you set up the a playground, because this seems like something that'll
reappear.My gdb kunf-fu exceptionally rusty now, so I better play in sandbox.And if error did't exist on fresh promoted replica it will give us useful info as well.
--Maxim Boguk
Senior Postgresql DBA
http://dataegret.com/
Phone RU: +7 985 433 0000
Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678
LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
All -
We just noticed that this exact same problem has now returned for us. It has exactly the same symptoms as my original post (including the fact that a snapshot does not have the issue).
I am open to all suggestions as to troubleshooting this further!
Thank you,
Jeremy
pgsql-general by date: