Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations
Date
Msg-id 20220331042038.udnydrj4x45i4khb@alap3.anarazel.de
Whole thread Raw
In response to Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Andres Freund <andres@anarazel.de>)
Responses Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Hi,

On 2022-03-30 21:04:07 -0700, Andres Freund wrote:
> On 2022-03-30 20:35:25 -0700, Peter Geoghegan wrote:
> > On Wed, Mar 30, 2022 at 8:28 PM Andres Freund <andres@anarazel.de> wrote:
> > > I triggered twice now, but it took a while longer the second time.
> >
> > Great.
> >
> > I wonder if you can get an RR recording...
>
> Started it, but looks like it's too slow.
>
> (gdb) p MyProcPid
> $1 = 2172500
>
> (gdb) p vacrel->NewRelfrozenXid
> $3 = 717
> (gdb) p vacrel->relfrozenxid
> $4 = 717
> (gdb) p OldestXmin
> $5 = 5112
> (gdb) p aggressive
> $6 = false

I added a bunch of debug elogs to see what sets *frozenxid_updated to true.

(gdb) p *vacrel
$1 = {rel = 0x7fe24f3e0148, indrels = 0x7fe255c17ef8, nindexes = 2, aggressive = false, skipwithvm = true,
failsafe_active= false,
 
  consider_bypass_optimization = true, do_index_vacuuming = true, do_index_cleanup = true, do_rel_truncate = true,
bstrategy= 0x7fe255bb0e28, pvs = 0x0,
 
  relfrozenxid = 717, relminmxid = 6, old_live_tuples = 42, OldestXmin = 20751, vistest = 0x7fe255058970
<GlobalVisSharedRels>,FreezeLimit = 4244988047,
 
  MultiXactCutoff = 4289967302, NewRelfrozenXid = 717, NewRelminMxid = 6, skippedallvis = false, relnamespace =
0x7fe255c17bf8"pg_catalog",
 
  relname = 0x7fe255c17cb8 "pg_database", indname = 0x0, blkno = 4294967295, offnum = 0, phase =
VACUUM_ERRCB_PHASE_SCAN_HEAP,verbose = false,
 
  dead_items = 0x7fe255c131d0, rel_pages = 8, scanned_pages = 8, removed_pages = 0, lpdead_item_pages = 0,
missed_dead_pages= 0, nonempty_pages = 8,
 
  new_rel_tuples = 124, new_live_tuples = 42, indstats = 0x7fe255c18320, num_index_scans = 0, tuples_deleted = 0,
lpdead_items= 0, live_tuples = 42,
 
  recently_dead_tuples = 82, missed_dead_tuples = 0}

But the debug elog reports that

relfrozenxid updated 714 -> 717
relminmxid updated 1 -> 6

Tthe problem is that the crashing backend reads the relfrozenxid/relminmxid
from the shared relcache init file written by another backend:

2022-03-30 21:10:47.626 PDT [2625038][autovacuum worker][6/433:0][] LOG:  automatic vacuum of table
"contrib_regression_postgres_fdw.pg_catalog.pg_database":index scans: 1
 
        pages: 0 removed, 8 remain, 8 scanned (100.00% of total)
        tuples: 4 removed, 114 remain, 72 are dead but not yet removable
        removable cutoff: 20751, older by 596 xids when operation ended
        new relfrozenxid: 717, which is 3 xids ahead of previous value
        new relminmxid: 6, which is 5 mxids ahead of previous value
        index scan needed: 3 pages from table (37.50% of total) had 8 dead item identifiers removed
        index "pg_database_datname_index": pages: 2 in total, 0 newly deleted, 0 currently deleted, 0 reusable
        index "pg_database_oid_index": pages: 6 in total, 0 newly deleted, 2 currently deleted, 2 reusable
        I/O timings: read: 0.050 ms, write: 0.102 ms
        avg read rate: 209.860 MB/s, avg write rate: 76.313 MB/s
        buffer usage: 42 hits, 22 misses, 8 dirtied
        WAL usage: 13 records, 5 full page images, 33950 bytes
        system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
...
2022-03-30 21:10:47.772 PDT [2625043][autovacuum worker][:0][] DEBUG:  InitPostgres
2022-03-30 21:10:47.772 PDT [2625043][autovacuum worker][6/0:0][] DEBUG:  my backend ID is 6
2022-03-30 21:10:47.772 PDT [2625043][autovacuum worker][6/0:0][] LOG:  reading shared init file
2022-03-30 21:10:47.772 PDT [2625043][autovacuum worker][6/443:0][] DEBUG:  StartTransaction(1) name: unnamed;
blockState:DEFAULT; state: INPROGRESS, xid/sub>
 
2022-03-30 21:10:47.772 PDT [2625043][autovacuum worker][6/443:0][] LOG:  reading non-shared init file

This is basically the inverse of a54e1f15 - we read a *newer* horizon. That's
normally fairly harmless - I think.

Perhaps we should just fetch the horizons from the "local" catalog for shared
rels?

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Next
From: Andres Freund
Date:
Subject: Re: Removing more vacuumlazy.c special cases, relfrozenxid optimizations