Thread: datfrozenxid not dropping after vacuum

datfrozenxid not dropping after vacuum

From: Matthew Tice
Hi,

Starting this morning at 0830 local time I noticed that the age of my
datfrozenxid started moving past the `autovacuum_freeze_max_age` value
of 200000000.  When we've encountered this in the past, the solution
has been one of the following:

1. It's related to an error similar to
```
found xmin 2675436435 from before relfrozenxid 321165377
```
where the solution has been to move the `pg_internal.init` file out of
the way and let PostgreSQL recreate it.  Or;

2. A long-running transaction.  Typically I'll just find the `idle in
transaction` sessions whose `query_start` is around the time my alarm
went off notifying me that `datfrozenxid` had breached
`autovacuum_freeze_max_age`, using a query similar to
```
SELECT pid, query_start, datname, usename, state, backend_xmin,
age(backend_xmin)
FROM pg_stat_activity
WHERE state = 'idle in transaction';
```

3. The autovacuum process seemed to be "stuck" on a particular table,
in which case we would kill the pid of the autovacuum worker (a rough
sketch of what we run follows this list).
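
For #3, this is roughly the kind of thing we run (a sketch only; the
`<pid>` is whatever the first query reports as the stuck worker):
```
-- spot autovacuum workers and how long they've been running
SELECT pid, query_start, query
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';

-- then, for the stuck worker:
-- SELECT pg_cancel_backend(<pid>);     -- polite cancel first
-- SELECT pg_terminate_backend(<pid>);  -- last resort
```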

The problem is that none of these approaches has dropped
`datfrozenxid` back down, and there is one specific database in this
cluster that's holding onto it.

Using these queries from CrunchyData:

# Show oldest current xid
WITH max_age AS (
    SELECT 2000000000 as max_old_xid
        , setting AS autovacuum_freeze_max_age
        FROM pg_catalog.pg_settings
        WHERE name = 'autovacuum_freeze_max_age' )
, per_database_stats AS (
    SELECT datname
        , m.max_old_xid::int
        , m.autovacuum_freeze_max_age::int
        , age(d.datfrozenxid) AS oldest_current_xid
    FROM pg_catalog.pg_database d
    JOIN max_age m ON (true)
    WHERE d.datallowconn )
SELECT max(oldest_current_xid) AS oldest_current_xid
    , max(ROUND(100*(oldest_current_xid/max_old_xid::float))) AS percent_towards_wraparound
    , max(ROUND(100*(oldest_current_xid/autovacuum_freeze_max_age::float))) AS percent_towards_emergency_autovac
FROM per_database_stats;

 oldest_current_xid | percent_towards_wraparound | percent_towards_emergency_autovac
--------------------+----------------------------+-----------------------------------
          230935699 |                         12 |                               115

# Top 8 individual databases
SELECT datname
    , age(datfrozenxid)
    , current_setting('autovacuum_freeze_max_age')
FROM pg_database
ORDER BY 2 DESC LIMIT 8;

       datname       |    age    | current_setting
---------------------+-----------+-----------------
 siteservice         | 230935699 | 200000000
 coupon-ws           | 217613246 | 200000000
 contextchangestore  | 211343280 | 200000000
 template0           | 210351639 | 200000000
 productmanager      | 207876167 | 200000000
 rhistory            | 207876167 | 200000000
 smsservice          | 207876167 | 200000000
 techservice         | 205432524 | 200000000

That top database, `siteservice`, is the "problematic" one: a manual
vacuum won't coerce it into freeing up the `datfrozenxid`.
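
For reference, the kind of manual vacuum I've been running against it
is roughly along the lines of (a database-wide freeze):
```
\c siteservice
VACUUM (FREEZE, VERBOSE);
```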

Looking at the tables in that database:

# SELECT c.oid::regclass
    , age(c.relfrozenxid)
    , pg_size_pretty(pg_total_relation_size(c.oid))
FROM pg_class c
JOIN pg_namespace n on c.relnamespace = n.oid
WHERE relkind IN ('r', 't', 'm')
AND n.nspname NOT IN ('pg_toast')
ORDER BY 2 DESC LIMIT 10;

       oid         |    age    | pg_size_pretty
--------------------+-----------+----------------
 pg_database        | 230935699 | 4264 kB
 pg_proc            |  93543215 | 976 kB
 pg_collation       |  93543215 | 560 kB
 pg_attribute       |  93543215 | 600 kB
 pg_shdepend        |  59515320 | 15 MB
 pg_statistic       |  53828900 | 464 kB
 pg_subscription    |  53172718 | 16 kB
 pg_pltemplate      |  53172718 | 56 kB
 pg_authid          |  53172718 | 8616 kB
 pg_db_role_setting |  53172718 | 64 kB

I thought maybe it had to do with my replication slots somehow:

# select slot_name, slot_type, database, active, catalog_xmin,
    restart_lsn, confirmed_flush_lsn
  from pg_replication_slots;

        slot_name         | slot_type |        database        | active | catalog_xmin |  restart_lsn  | confirmed_flush_lsn
--------------------------+-----------+------------------------+--------+--------------+---------------+---------------------
 dbs1db02                 | physical  |                        | f      |              |               |
 dbs1db01                 | physical  |                        | t      |              | 4D25/ACE6EE08 |
 dbs1db03                 | physical  |                        | t      |              | 4D25/ACE6EE08 |
 dbs2db01                 | physical  |                        | t      |              | 4D25/ACE6EE08 |
 debezium_cmanager        | logical   | campaign-manager       | t      |   2152258063 | 4D25/A421A6C8 | 4D25/ABC18C88
 debezium_rservice        | logical   | retail-content-service | t      |   2152238060 | 4D25/8EC403B0 | 4D25/A6105DF8
 debezium_partnerservice  | logical   | partnerservice         | t      |   2152238060 | 4D25/8EC403B0 | 4D25/A5446630
 dbs1_dr                  | physical  |                        | f      |              |               |

So I either restarted the physical standbys or restarted the logical
connections (Debezium -> Kafka).
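
A rough way to see whether a slot's xmin horizon might be what's
pinning things (the xmin / catalog_xmin columns wrapped in age(); a
sketch, nothing authoritative):
```
SELECT slot_name, slot_type, active
    , age(xmin) AS xmin_age
    , age(catalog_xmin) AS catalog_xmin_age
FROM pg_replication_slots
ORDER BY 5 DESC NULLS LAST;
```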

I'm also working on a:
# vacuumdb --all --jobs=10 --echo --analyze

But this is still running (and honestly I'm not really hopeful it's
going to do the trick).

My primary, read/write database is Postgresql 10.4 (CentOS 7) while my
standby databases have been patched to 10.17.

Any thoughts on this?

Thanks,
Matt



Re: datfrozenxid not dropping after vacuum

From: Alvaro Herrera
On 2021-Sep-01, Matthew Tice wrote:

[ problem table is pg_database ]

> My primary, read/write database is Postgresql 10.4 (CentOS 7) while my
> standby databases have been patched to 10.17.

Hmm, I think there was a bug in the early 10.x versions where advancing
the xid age of shared tables would not work correctly for some reason ...
Ah yes, this was fixed in 10.5, a mere three years ago:

Author: Andres Freund <andres@anarazel.de>
Branch: master Release: REL_11_BR [a54e1f158] 2018-06-12 11:13:21 -0700
Branch: REL_10_STABLE Release: REL_10_5 [2ce64caaf] 2018-06-12 11:13:21 -0700
Branch: REL9_6_STABLE Release: REL9_6_10 [6a46aba1c] 2018-06-12 11:13:21 -0700
Branch: REL9_5_STABLE Release: REL9_5_14 [14b3ec6f3] 2018-06-12 11:13:21 -0700
Branch: REL9_4_STABLE Release: REL9_4_19 [817f9f9a8] 2018-06-12 11:13:22 -0700
Branch: REL9_3_STABLE Release: REL9_3_24 [9b9b622b2] 2018-06-12 11:13:22 -0700

    Fix bugs in vacuum of shared rels, by keeping their relcache entries current.
    
    When vacuum processes a relation it uses the corresponding relcache
    entry's relfrozenxid / relminmxid as a cutoff for when to remove
    tuples etc. Unfortunately for nailed relations (i.e. critical system
    catalogs) bugs could frequently lead to the corresponding relcache
    entry being stale.
    
    This set of bugs could cause actual data corruption as vacuum would
    potentially not remove the correct row versions, potentially reviving
    them at a later point.  After 699bf7d05c some corruptions in this vein
    were prevented, but the additional error checks could also trigger
    spuriously. Examples of such errors are:
      ERROR: found xmin ... from before relfrozenxid ...
    and
      ERROR: found multixact ... from before relminmxid ...
    To be caused by this bug the errors have to occur on system catalog
    tables.
    
    The two bugs are:
    
    1) Invalidations for nailed relations were ignored, based on the
       theory that the relcache entry for such tables doesn't
       change. Which is largely true, except for fields like relfrozenxid
       etc.  This means that changes to relations vacuumed in other
       sessions weren't picked up by already existing sessions.  Luckily
       autovacuum doesn't have particularly longrunning sessions.
    
    2) For shared *and* nailed relations, the shared relcache init file
       was never invalidated while running.  That means that for such
       tables (e.g. pg_authid, pg_database) it's not just already existing
       sessions that are affected, but even new connections are as well.
       That explains why the reports usually were about pg_authid et. al.
    
    To fix 1), revalidate the rd_rel portion of a relcache entry when
    invalid. This implies a bit of extra complexity to deal with
    bootstrapping, but it's not too bad.  The fix for 2) is simpler,
    simply always remove both the shared and local init files.
    
    Author: Andres Freund
    Reviewed-By: Alvaro Herrera
    Discussion:
        https://postgr.es/m/20180525203736.crkbg36muzxrjj5e@alap3.anarazel.de
        https://postgr.es/m/CAMa1XUhKSJd98JW4o9StWPrfS=11bPgG+_GDMxe25TvUY4Sugg@mail.gmail.com
        https://postgr.es/m/CAKMFJucqbuoDRfxPDX39WhA3vJyxweRg_zDVXzncr6+5wOguWA@mail.gmail.com
        https://postgr.es/m/CAGewt-ujGpMLQ09gXcUFMZaZsGJC98VXHEFbF-tpPB0fB13K+A@mail.gmail.com
    Backpatch: 9.3-


-- 
Álvaro Herrera           39°49'30"S 73°17'W
"El número de instalaciones de UNIX se ha elevado a 10,
y se espera que este número aumente" (UPM, 1972)



Re: datfrozenxid not dropping after vacuum

From: Matthew Tice
Hi Alvaro, thanks for the quick reply.

I'm scheduled to do my patching maintenance at the end of this month -
but at this point I don't think I'm going to make it.

Other than patching, is there a workaround?  For example, in #2 above:
>The fix for 2) is simpler,
>    simply always remove both the shared and local init files.

I'm not familiar with the difference between the 'shared' and 'local'
init files (I'd imagine the one I referenced in my original post is a
'local' one)?



Thanks!

Matt




Re: datfrozenxid not dropping after vacuum

From: Matthew Tice
Interestingly enough, I hopped on the database system this morning and
found that the age of `datfrozenxid` had dropped back down below
`autovacuum_freeze_max_age` around 0200 local time (roughly 18 hours
after the fact).

Looking through the Postgresql logs I don't see anything standing out
at that time.

I still plan on patching to 10.17 tonight.

Matt




Re: datfrozenxid not dropping after vacuum

From: Alvaro Herrera
On 2021-Sep-01, Matthew Tice wrote:

> Hi Alvaro, thanks for the quick reply.

Hi.  Glad to hear that your problem is now behind you.

> I'm scheduled to do my patching maintenance at the end of this month -
> but at this point I don't think I'm going to make it.
> 
> Other than patching, is there a work around?

Hm, in my previous reply I had written a suggestion to vacuum
pg_database in the offending database after deleting the
pg_internal.init file, but evidently I edited it out before sending.
(Probably because I wasn't sure whether you need to delete the file,
connect, vacuum, or rather connect, delete the file, vacuum.)

> For example, in #2 above:
> >The fix for 2) is simpler,
> >    simply always remove both the shared and local init files.
> 
> I'm not familiar with the differences between 'shared' and 'local'
> init files (I'd imagine I referenced a 'local' file in my original
> post)?

The global file is in the global/ subdirectory of the data directory,
and the "local" ones are each in the corresponding database directory:

$ cd $PGDATA
$ find . -name pg_internal.init
./base/12758/pg_internal.init
./base/46212/pg_internal.init
./global/pg_internal.init

$ psql -c "select oid, datname from pg_database"
  oid  |  datname   
-------+------------
 12757 | postgres
 12758 | alvherre
     1 | template1
 12756 | template0
 46212 | regression
(5 rows)

So in the above there are cache files for databases regression and
alvherre, plus the global one.
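
So the workaround I had in mind, sketched out (untested, and per the
above I'm not certain whether the file removal should happen before or
after you open the connection), would be something like:

$ cd $PGDATA
$ rm global/pg_internal.init base/<oid-of-siteservice>/pg_internal.init
$ psql -d siteservice -c "VACUUM FREEZE pg_catalog.pg_database;"

where <oid-of-siteservice> is whatever pg_database reports for that
database.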

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Hay dos momentos en la vida de un hombre en los que no debería
especular: cuando puede permitírselo y cuando no puede" (Mark Twain)