BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated) - Mailing list pgsql-bugs

From tgarnett@panjiva.com
Subject BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)
Date
Msg-id 20150406192130.2573.22545@wrigleys.postgresql.org
Responses Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)  (Kevin Grittner <kgrittn@ymail.com>)
Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)  (Bruce Momjian <bruce@momjian.us>)
Re: BUG #12990: Missing pg_multixact/members files (appears to have wrapped, then truncated)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      12990
Logged by:          Timothy Garnett
Email address:      tgarnett@panjiva.com
PostgreSQL version: 9.3.5
Operating system:   Ubuntu Linux x86_64 12.04.5 LTS
Description:

At 5:10pm on 4/1 our production database started throwing errors like the
following (all of our hot spares were similarly corrupted):

ERROR: could not access status of transaction 303450738
DETAIL: Could not open file "pg_multixact/members/7B49": No such file or
directory.

This is possibly related to bug #8673 (though from the mailing list it sounds
like most issues there were resolved by 9.3.4, and we are on 9.3.5). Also of
note, this db cluster was pg_upgraded from 9.2 at one point (to 9.3.x, then
further upgraded to 9.3.5); see
http://www.postgresql.org/message-id/CAAS3ty+2ynCyf_YmRn6WuqSF8EmJMDypAkc7uD_EXTZJ7usOSg@mail.gmail.com
though it doesn't seem like that's involved here.
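For context on where our multixact horizon stood, a query along these lines
shows the per-database oldest multixact (datminmxid is a 9.3 catalog column;
the cast is only there to get a numeric sort, since xid has no ordering
operator), and pg_controldata reports the latest checkpoint's NextMultiXactId
and oldestMultiXid. This is a sketch rather than exactly what we ran:

    -- per-database oldest multixact; the cluster-wide minimum is what limits
    -- how much of pg_multixact can ever be truncated
    SELECT datname, datminmxid
    FROM pg_database
    ORDER BY datminmxid::text::bigint;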

We have a file-system-level snapshot of a hot spare of the db from about 28
hours before the errors started and sufficient WAL files to do point-in-time
recovery. We recovered a copy of the database to about 1 hour before the
errors started and promoted that to be our production db (losing a couple of
hours of commits).

Walking through the point-in-time recovery, we see the pg_multixact/members
folder fill up and eventually suffer a massive truncation once the newest
file overtakes the oldest file. The range of files from oldest to newest
(wrapping around at segment 14078) appears to be continuous.

Time   Newest-File   Oldest-File
1 am   72BA          98A4  (last Oct.)
10 am  72BC          98A4
11 am  72BD          98A4
12 pm  7C4E          98A4
1 pm   7E88          98A4
2 pm   7FFF          98A4
3 pm   884E          98A4
4 pm   905F          98A4   * here we forked our new production
4:30   94AF          98D2
5 pm   984D          98D2
5:15   98D2          9900   * now errors, all files outside of 98D2-9900 are gone

During the rapid growth we were generating 10-12 pg_multixact/members files
per minute.
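
We were watching the directory at the filesystem level; for what it's worth,
a superuser can get roughly the same picture from SQL with something along
these lines (min/max are by file name, so they only line up with oldest and
newest while the names haven't wrapped around 14078):

    -- count the member segment files and show the lowest/highest names
    SELECT count(*) AS segments,
           min(f)   AS lowest_name,
           max(f)   AS highest_name
    FROM pg_ls_dir('pg_multixact/members') AS d(f);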

After we forked off a new production we identified the workload that was
causing this rapid growth and stopped it. We then used vacuumdb -a -F to
vacuum freeze the whole database cluster. Our presumption was that this
would free up the members files, but it only freed up a few. The current
oldest file is 9E30 (~Jan. 1st), so it moved up from 98A4, and the newest
file is 906A. We're concerned this might be a time bomb waiting for us in
the future (though at the slower growth rate of ~16 files a day, potentially
a while in the future), as the members file namespace is still 95+% consumed
post vacuum freeze. We do plan to upgrade to 9.4 sometime in the next couple
of months and are curious whether we can use pg_upgrade or whether we will
need to dump / restore the full (multi-TiB) cluster.
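
In case it's useful to anyone looking at this, the per-table side of the same
question (which relations hold the database's datminmxid, and therefore
members truncation, back) can be sketched as something like the following
(again 9.3 catalog columns, and only a sketch):

    -- inside a given database, the ordinary tables whose relminmxid is
    -- furthest behind are what keep datminmxid from advancing
    SELECT relname, relminmxid
    FROM pg_class
    WHERE relkind = 'r'
    ORDER BY relminmxid::text::bigint
    LIMIT 20;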

As for the workload causing the rapid growth, it involved something like:
while [many millions of things to update / insert]
  BEGIN;
  SELECT state FROM table_A WHERE id = 1 FOR SHARE;
  if state != 'blocked'
    update / insert 2000-10000 rows in table_B, other stuff
    COMMIT;
  else
    COMMIT;
    sleep wait
being run in a bunch of connections (26) to the db. Row 1 of table_A was
effectively being used as a shared/exclusive lock: a separate process would
update its state to 'blocked' in order to pause the first process, as a way
of managing some state shared outside the db.
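
If it helps, our (possibly incorrect) understanding of why this pattern burns
through members space so quickly: a single FOR SHARE locker is recorded
directly in the row's xmax, but once a second overlapping locker arrives the
lockers have to be tracked in a multixact, and every additional overlapping
locker creates a new multixact carrying a fresh copy of the whole member
list. With 26 connections repeatedly doing this against row 1, that is
roughly (minus the scale):

    -- session 1
    BEGIN;
    SELECT state FROM table_A WHERE id = 1 FOR SHARE;  -- single locker, no multixact yet

    -- session 2, while session 1 still holds its lock
    BEGIN;
    SELECT state FROM table_A WHERE id = 1 FOR SHARE;  -- multixact created with members {1,2}

    -- session 3, while 1 and 2 still hold theirs
    BEGIN;
    SELECT state FROM table_A WHERE id = 1 FOR SHARE;  -- a new, larger multixact {1,2,3}

    -- ...and so on: each overlapping FOR SHARE allocates another multixact
    -- whose member list is one longer than the previous one, so members
    -- consumption grows much faster than the number of lock requests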

We've retained the pre-failure snapshot and WAL files for now (though we
will need to free up that space at some point), so we can do point-in-time
recovery to any point from 28 hours before the problem surfaced to several
hours after it, if that's helpful at all.

Bugs / Questions:
 - the members files wrapped around over themselves, leading to corruption
 - why didn't vacuumdb -a -F free up more members files?
