Re: found xmin from before relfrozenxid on pg_catalog.pg_authid - Mailing list pgsql-general

From Jeremy Finzel
Subject Re: found xmin from before relfrozenxid on pg_catalog.pg_authid
Msg-id CAMa1XUjOhoxrcdtTPyxBU_CUwjii2Bkc1Fdnto4K-oATDAv4CA@mail.gmail.com
In response to Re: found xmin from before relfrozenxid on pg_catalog.pg_authid (Maxim Boguk <maxim.boguk@gmail.com>)
List pgsql-general


On Tue, May 22, 2018 at 2:42 PM, Maxim Boguk <maxim.boguk@gmail.com> wrote:


On Tue, May 22, 2018 at 10:30 PM, Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2018-05-22 22:18:15 +0300, Maxim Boguk wrote:
> On Tue, May 22, 2018 at 9:47 PM, Andres Freund <andres@anarazel.de> wrote:
> > >  select relfrozenxid from pg_class where relname='pg_authid';
> > >  relfrozenxid
> > > --------------
> > >    2863429136

> select txid_current();
>  txid_current
> --------------
>   41995913769

So that's an xid of 3341208114, if you leave the epoch out. What's
autovacuum_freeze_max_age set to in that cluster?

postgres=# show autovacuum_freeze_max_age;
 autovacuum_freeze_max_age
---------------------------
 200000000
(default value, I think)
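
For anyone reproducing the arithmetic, both checks are plain SQL. A minimal sketch, assuming it is run in the affected database (4294967296 is 2^32; age() reports how many transactions old a given xid is):

-- strip the 32-bit epoch from the 64-bit txid_current() value;
-- the high bits count how many times the xid counter has wrapped
select 41995913769 % 4294967296 as xid_without_epoch;

-- how far pg_authid's relfrozenxid lags behind the current transaction;
-- autovacuum forces a freeze once this exceeds autovacuum_freeze_max_age
select relname, relfrozenxid, age(relfrozenxid) as xid_age
from pg_class
where relname = 'pg_authid';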

 
Can you show pg_controldata output, and relminmxid from that cluster?

postgres@db:~$ /usr/lib/postgresql/9.6/bin/pg_controldata -D /var/lib/postgresql/9.6/main
pg_control version number:            960
Catalog version number:               201608131
Database system identifier:           6469368654711450114
Database cluster state:               in production
pg_control last modified:             Tue 22 May 2018 10:20:14 PM MSK
Latest checkpoint location:           CCB5/F9C37950
Prior checkpoint location:            CCB0/43F316B0
Latest checkpoint's REDO location:    CCB1/6706BD88
Latest checkpoint's REDO WAL file:    000000010000CCB100000067
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          9:3341161759
Latest checkpoint's NextOID:          190071899
Latest checkpoint's NextMultiXactId:  59416233
Latest checkpoint's NextMultiOffset:  215588532
Latest checkpoint's oldestXID:        2814548646
Latest checkpoint's oldestXID's DB:   16400
Latest checkpoint's oldestActiveXID:  3341161759
Latest checkpoint's oldestMultiXid:   54264778
Latest checkpoint's oldestMulti's DB: 16400
Latest checkpoint's oldestCommitTsXid:2814548646
Latest checkpoint's newestCommitTsXid:3341161758
Time of latest checkpoint:            Tue 22 May 2018 10:05:16 PM MSK
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                on
max_connections setting:              2000
max_worker_processes setting:         8
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       on
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0
postgres=# select datname,datfrozenxid,datminmxid from pg_database order by datname;
  datname  | datfrozenxid | datminmxid
-----------+--------------+------------
 **        |   2815939794 |   54265194
 postgres  |   2863429136 |   54280819
 template0 |   3148297669 |   59342106
 template1 |   2816765546 |   59261794
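
The same numbers are easier to read as ages relative to the current transaction; a minimal sketch (mxid_age() is available in 9.5 and later):

select datname,
       age(datfrozenxid) as xid_age,
       mxid_age(datminmxid) as mxid_age
from pg_database
order by age(datfrozenxid) desc;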


I might be daft here, but it's surely curious that the relfrozenxid from
the error and the one in pg_catalog are so different (they don't even
have the same number of digits):
catalog: 2863429136
error:    248712603
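
One wrinkle worth keeping in mind while comparing these: pg_authid is a shared relation, but every database holds its own pg_class row for it, so the value can be read from more than one database. A sketch for comparing them side by side, assuming the database names from the pg_database output above (\c is psql's connect meta-command):

\c postgres
select current_database(), relfrozenxid, relminmxid
from pg_class where relname = 'pg_authid';

\c template1
select current_database(), relfrozenxid, relminmxid
from pg_class where relname = 'pg_authid';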


> About the gdb bt - it's tricky because it is the mission-critical master DB
> of a huge project.
> I'll try promoting a backup replica and check whether the issue persists
> there; if yes, we will have our playground for a while, but it will require
> some time to arrange.

You should be ok to just bt that in the running cluster, but I
definitely understand if you don't want to do that...  I'd appreciate it
if you set up a playground, because this seems like something that'll
reappear.

My gdb kung-fu is exceptionally rusty now, so I'd better play in a sandbox.
And if the error doesn't exist on a freshly promoted replica, that will give us useful info as well.
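
Once the playground is up, a minimal re-check on the promoted replica could look like the following sketch (it assumes the error reproduces under a plain VACUUM of the catalog):

-- confirm the node has actually been promoted and is writable
select pg_is_in_recovery();

-- try to reproduce the error; this is the operation that reports
-- "found xmin ... from before relfrozenxid ..."
vacuum pg_catalog.pg_authid;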

--
Maxim Boguk
Senior Postgresql DBA
http://dataegret.com/

Phone RU: +7  985 433 0000
Phone UA: +380 99 143 0000
Phone AU: +61  45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk

"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"


All - 

We just noticed that this exact problem has returned for us, with exactly the same symptoms as in my original post (including that a snapshot does not have the issue).

I am open to any suggestions for troubleshooting this further!

Thank you,
Jeremy
