
From: Peter Geoghegan
Subject: Infinite loop in GetMultiXactIdMembers() on 9.3.5 (involves pg_upgrade)
Msg-id: CAM3SWZTvy0w99Pe=cxKM_tXK_mx9phmXvvgW5scLZ4r4x2NyYw@mail.gmail.com
List: pgsql-bugs
A few weeks ago, a customer had a problem with their 9.3.5 database.
This was a database that pg_upgrade was run against at some point.
They were affected by the pg_upgrade + MultiXact truncation bug. I
manually performed the remediation suggested by the 9.3.5 release
notes: "In pg_upgrade, remove pg_multixact files left behind by
initdb". The query indicated that removing the offset file was the
correct course of action, and so it was duly removed. I checked at the
time, and the offset file looked like this:

$ pgdata/9.3/pg_multixact/offsets# od 0000
0000000 000000 000000 000000 000000 000000 000000 000000 000000
*
0020000

(so lots of zeroes, which I believe is consistent with an
after-initdb'd "0000" offset file from recent versions of Postgres).
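(For the record, the release-note check boils down to asking whether
MultiXactId 1 is still referenced anywhere - i.e. whether any relation
or database still has a minimum MultiXactId of 1 - in which case the
0000 segment is genuinely needed. From memory, so treat this as a
paraphrase of the idea rather than the verbatim release-note query, it
is something like:

select exists (select * from pg_class    where relminmxid = 1)
    or exists (select * from pg_database where datminmxid = 1) as file_is_needed;

run in every database of the cluster. If it comes back false
everywhere, the leftover offsets/0000 file is spurious and can be
removed, which is what the check indicated here.)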

This was escalated to me mainly because of some rather unpleasant
side-effects, which I don't think have been independently reported in
relation to either the truncation bug fixed in 9.3.5 or any other bug:
AV workers were stuck in uninterruptible states. Worse still, certain
queries with similar values in their predicates would also get stuck
in uninterruptible states, blocking on the autovacuum worker, which
appeared to be sitting on a buffer lock.
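
As an aside, these were easy enough to spot from pg_stat_activity.
Something along these lines does it (a sketch from memory, not a
transcript of my session; note that in 9.3 the boolean "waiting"
column only covers heavyweight lock waits, so a backend stuck on a
buffer lock still shows waiting = f):

select pid, state, waiting, now() - xact_start as xact_age, query
from pg_stat_activity
where query like 'autovacuum: VACUUM %(to prevent wraparound)'
   or waiting
order by xact_start;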

My immediate-term solution was to kill the stuck "autovacuum: VACUUM
public.anonymized (to prevent wraparound)" workers - I had to try
something, since this was a production problem. Since they were
uninterruptible (checkpoints couldn't finish at the time either), this
meant shutting the server down aggressively: I sent SIGKILL to the
workers, which naturally worked. Then, having noted down the 3
relations that the autovacuum workers were blocked on, I quickly
performed a manual VACUUM FREEZE against those relations before
autovacuum had the chance to try again. For whatever reason this
worked, and the VACUUM FREEZE finished almost immediately for all 3
relations. Anti-wraparound AV workers did not start up again, and for
the time being things calmed down.
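
Spelled out, the stopgap was nothing fancier than a plain manual
freeze, run once per affected relation (relation name anonymized here,
as elsewhere in this report):

VACUUM FREEZE public.anonymized;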

It's my recollection that the need to freeze was fairly organic: the
age() of each relation and our default autovacuum_freeze_max_age
setting were such that, on the face of it, anti-wraparound vacuuming
of the 3 relations was genuinely due, and age(relfrozenxid) wasn't
that far past the cutoff point for freezing. In other words, I don't
think this went unnoticed for very long, especially since it caused
several simple read queries touching the same narrow band of values to
become stuck. OTOH, that analysis doesn't account for
vacuum_multixact_freeze_table_age, which is also at its default value
(these settings are seldom touched).
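
For anyone wanting to eyeball how far tables are past the cutoffs,
something like the following does the job (just a sketch; like the
pg_class query further down, it leans on age() for relminmxid, which
strictly speaking measures against the XID counter, and the GUC names
are the stock 9.3 ones):

select c.oid::regclass as relation,
       age(c.relfrozenxid) as xid_age,
       age(c.relminmxid) as mxid_age,
       current_setting('autovacuum_freeze_max_age') as autovacuum_freeze_max_age,
       current_setting('vacuum_multixact_freeze_table_age') as vacuum_multixact_freeze_table_age
from pg_class c
where c.relkind = 'r'
order by age(c.relminmxid) desc
limit 10;

(In the pg_class output further down, the two "age" columns are
age(relfrozenxid) and age(relminmxid), in that order.)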

The VACUUM FREEZE remediation did hold up for a couple of weeks, but I
did not suggest running VACUUM FREEZE against any other relation.
Anyway, the customer had exactly the same problem and same set of
symptoms today, but this time on another relation. It is still stuck
now, and I have yet to perform the same remediation (I expect that it
will work again on their production system, without incident). Details
of the relation in question:

customerdb=# select age(relfrozenxid), age(relminmxid), * from
pg_class where relname = 'anonymized';
-[ RECORD 1 ]--+--------------------------
age            | 203504871
age            | 226954224
relname        | anonymized
relnamespace   | 2200
reltype        | 17453
reloftype      | 0
relowner       | 16384
relam          | 0
relfilenode    | 17451
reltablespace  | 0
relpages       | 2671
reltuples      | 145973
relallvisible  | 0
reltoastrelid  | 1641288
reltoastidxid  | 0
relhasindex    | t
relisshared    | f
relpersistence | p
relkind        | r
relnatts       | 8
relchecks      | 0
relhasoids     | f
relhaspkey     | t
relhasrules    | f
relhastriggers | t
relhassubclass | f
relispopulated | t
relfrozenxid   | 28386252
relminmxid     | 4936915
relacl         | [null]
reloptions     | [null]

So AFAICT, the issue re-emerged in exactly the same way as before
(checkpoints did finish on this most recent occasion, which is one
difference, though probably not a significant one).

The infinite loop within GetMultiXactIdMembers() runs as follows. The
offset fetched for the next MultiXactId keeps reading as zero, which
the code takes to mean that it simply hasn't been written out yet, so
it sleeps at multixact.c:1261 and retries - forever:

(gdb) n
1214 LWLockAcquire(MultiXactOffsetControlLock, LW_EXCLUSIVE);
(gdb)
1219 slotno = SimpleLruReadPage(MultiXactOffsetCtl, pageno, true, multi);
(gdb)
1220 offptr = (MultiXactOffset *)
MultiXactOffsetCtl->shared->page_buffer[slotno];
(gdb)
1222 offset = *offptr;
(gdb)
1242 if (tmpMXact < FirstMultiXactId)
(gdb)
1250 if (pageno != prev_pageno)
(gdb)
1255 nextMXOffset = *offptr;
(gdb)
1257 if (nextMXOffset == 0)
(gdb)
1260 LWLockRelease(MultiXactOffsetControlLock);
(gdb)
1261 pg_usleep(1000L);
(gdb)

Backtrace from within stuck autovacuum worker - where the above
infinite loop occurs:

#0  GetMultiXactIdMembers (allow_old=<optimized out>,
members=0x7fff2ee34620, multi=4920224) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/transam/multixact.c:1261
#1  GetMultiXactIdMembers (multi=4920224, members=0x7fff2ee34620,
allow_old=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/transam/multixact.c:1100
#2  0x00007f4600603a08 in FreezeMultiXactId (flags=<synthetic
pointer>, cutoff_multi=<optimized out>, cutoff_xid=181886916,
t_infomask=<optimized out>, multi=4920224) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/heap/heapam.c:5379
#3  heap_prepare_freeze_tuple (tuple=0x7f447b297ae8,
cutoff_xid=181886916, cutoff_multi=<optimized out>,
frz=0x7f4601531638) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/heap/heapam.c:5617
#4  0x00007f46007061aa in lazy_scan_heap (scan_all=1 '\001',
nindexes=2, Irel=<optimized out>, vacrelstats=<optimized out>,
onerel=0x7f4600508e10) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/commands/vacuumlazy.c:899
#5  lazy_vacuum_rel (onerel=0x7f4600508e10, vacstmt=<optimized out>,
bstrategy=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/commands/vacuumlazy.c:236
#6  0x00007f4600703c75 in vacuum_rel (relid=17451,
vacstmt=0x7fff2ee34e30, do_toast=0 '\000', for_wraparound=1 '\001') at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/commands/vacuum.c:1288
#7  0x00007f4600704b05 in vacuum (vacstmt=0x7fff2ee34e30,
relid=<optimized out>, do_toast=0 '\000', bstrategy=<optimized out>,
for_wraparound=1 '\001', isTopLevel=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/commands/vacuum.c:237
#8  0x00007f460079abfa in autovacuum_do_vac_analyze
(bstrategy=0x7f4601547d30, tab=0x7f46015480d0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/autovacuum.c:2811
#9  do_autovacuum () at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/autovacuum.c:2329
#10 0x00007f460079b0f6 in AutoVacWorkerMain (argc=<optimized out>,
argv=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/autovacuum.c:1679
#11 0x00007f460079b1ba in StartAutoVacWorker () at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/autovacuum.c:1464
#12 0x00007f46007a86d5 in StartAutovacuumWorker () at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:5070
#13 sigusr1_handler (postgres_signal_arg=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:4725
#14 <signal handler called>
#15 0x00007f45fe652633 in select () from /lib/x86_64-linux-gnu/libc.so.6
#16 0x00007f46007a7943 in ServerLoop () at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:1546
#17 PostmasterMain (argc=<optimized out>, argv=<optimized out>) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/postmaster/postmaster.c:1253
#18 0x00007f46005dc687 in main (argc=5, argv=0x7f46013ee3b0) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/main/main.c:226

More details:

#3  heap_prepare_freeze_tuple (tuple=0x7f447b297ae8,
cutoff_xid=181886916, cutoff_multi=<optimized out>,
frz=0x7f4601531638) at
/tmp/buildd/postgresql-9.3-9.3.5/build/../src/backend/access/heap/heapam.c:5617
warning: Source file is more recent than executable.
5617 newxmax = FreezeMultiXactId(xid, tuple->t_infomask,
(gdb) p *tuple
$9 = {t_choice = {t_heap = {t_xmin = 96643451, t_xmax = 4920224,
t_field3 = {t_cid = 0, t_xvac = 0}}, t_datum = {datum_len_ = 96643451,
datum_typmod = 4920224, datum_typeid = 0}}, t_ctid = {ip_blkid =
{bi_hi = 0, bi_lo = 1743}, ip_posid = 15}, t_infomask2 = 8, t_infomask
= 6531, t_hoff = 24 '\030', t_bits =  <incomplete sequence \357>}
(gdb) p *frz
$10 = {xmax = 4920224, offset = 0, t_infomask2 = 8, t_infomask = 6531,
frzflags = 1 '\001'}

I have this physical restoration of the customer's database "on ice",
so I can run GDB on it at my leisure if more details are required. I'm
pretty sure I could "fix" the copy by once again manually running
VACUUM FREEZE, but it would be inconvenient to recreate the problem
after doing that, so I'd rather leave it stuck as-is for now.

I have an independent report of a similar uninterruptible/undetected
deadlock on another 9.3.5 instance that also used pg_upgrade at some
point in the past, but I do not believe that any offset file was ever
deleted from that instance, and in any case the query from the 9.3.5
release notes did not indicate that it should be. It might just be a
coincidence: I haven't even verified that autovacuum was involved in
that other incident, since someone else took matters into their own
hands before I was informed (they did not know to run VACUUM FREEZE,
whereas anti-wraparound vacuums were frequently needed with the first
customer). I'm keeping an eye on that one too; it could be nothing, or
an entirely unrelated issue.
--
Peter Geoghegan
