Re: [HACKERS] Orphaned files in base/[oid] - Mailing list pgsql-hackers
From: Chris Travers
Subject: Re: [HACKERS] Orphaned files in base/[oid]
Date:
Msg-id: CAN-RpxAAjKwCMb6iQ7uvOReZOXqTzPs3jT_3zo1xmWXBerWM7g@mail.gmail.com
In response to: Re: [HACKERS] Orphaned files in base/[oid] (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: [HACKERS] Orphaned files in base/[oid]
List: pgsql-hackers
On Tue, Aug 15, 2017 at 3:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Chris Travers <chris.travers@adjust.com> writes:
>> I wonder about a different solution. Would it be possible to special case
>> vacuum to check for and remove (or just move to where they can be removed)
>> files when vacuuming pg_class? At the point we are vacuuming pg_class, we
>> ought to be able to know that a relfilenode shouldn't be used anymore,
>> right?
> I don't think so. It's not clear to me whether you have in mind "scan
> pg_class, collect relfilenodes from all live tuples, then zap all files
> not in that set" or "when removing a dead tuple, zap the relfilenode
> it mentions". But neither one works. The first case has a race condition
> against new pg_class entries. As for the second, the existence of a dead
> tuple bearing relfilenode N isn't evidence that some other live tuple
> can't have relfilenode N.
Ah, because if the file never made it onto disk, the number could be re-used.
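To make the race in the first approach concrete, here is a hypothetical sketch of the "scan pg_class, then zap" idea (database name and paths are illustrative only):

DB=mydb
DATADIR=/pgdata/main
dboid=`psql -Atc "select oid from pg_database where datname = current_database()" $DB`
psql -Atc "select relfilenode from pg_class union select oid from pg_class" $DB > live.nodes
# <-- race window: a relation created and committed here already has a file on
#     disk, but its relfilenode is not in live.nodes, so the next step would
#     wrongly report it as orphaned.
(cd $DATADIR/base/$dboid; ls [0-9]*[0-9] | grep -v '\.') | grep -vwFf live.nodes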
> Another problem for the second solution is that in the case you're worried
> about (ie, PANIC due to out-of-WAL-space during relation's creating
> transaction), there's no very good reason to expect that the relation's
> pg_class tuple ever made it to disk at all.
> A traditional low-tech answer to this has been to keep the WAL on a
> separate volume from the main data store, so that it's protected from
> out-of-space conditions in the main store and temp areas. The space
> needs for WAL proper are generally much more predictable than the main
> store, so it's easier to keep the dedicated space from overflowing.
> (Stalled replication/archiving processes can be hazardous to your
> health in this scenario, though, if they prevent prompt recycling of
> WAL files.)
Yeah, most of our dbs here have WAL on a separate volume, but not this system. This system is also unusual in that disk usage varies wildly. (I am not 100% sure that this is the only case that causes the problem, though I can reproduce it consistently when the WAL writer runs out of disk space, with symptoms exactly matching what I found.)
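For reference, a minimal sketch of that kind of setup (the paths and mount point are hypothetical, and pre-10 releases name the WAL directory pg_xlog rather than pg_wal):

# New cluster: put WAL on a dedicated volume from the start.
initdb -D /pgdata/main -X /pgwal/main

# Existing cluster: move pg_xlog onto the dedicated volume and symlink it back
# while the server is stopped.
pg_ctl -D /pgdata/main stop
mv /pgdata/main/pg_xlog /pgwal/main_pg_xlog
ln -s /pgwal/main_pg_xlog /pgdata/main/pg_xlog
pg_ctl -D /pgdata/main start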
So for now that leaves my fallback approach as a way to fix it when I see it.
I have written a shell script which does the following:
1. Starts Postgres in single-user mode with the given data directory, or dies (it won't run if Postgres appears to be running already).
2. Gets the oid of the current database.
3. Lists all files consisting of only digits in the base/[dboid] directory.
4. Asks Postgres (in single-user mode again) for all relfilenodes and oids of tables. (In my testing both were required because there were some cases where relfilenodes were not set for some system catalogs.)
5. Loops through the file nodes gathered, checks them against the relfilenode entries, and zaps $f, $f_*, and $f.*. For testing, "zapping" currently means moving the files to a lostnfound folder for inspection after the script runs. The logic here is not perfect and is very slightly under-inclusive, but better that than the other way.
Then we can start Postgres again. I cannot find a better way to avoid race conditions, I guess. At any rate, it sounds like preventing the problem more generally may be beyond what I would feel comfortable attempting as a patch at my current level of familiarity with the source code.
The full script is included inline below my signature in case it is of interest to anyone on the list.
> regards, tom lane
Best Regards,
Chris Travers
Database Administrator
---
#!/bin/bash
datadir=$1
database=$2
pg_ctl -D $datadir stop
dboidfile="$PWD/cleanupdb.oid"
reloidfile="$PWD/refilenodes.list"
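# Ask Postgres (in single-user mode) for the current database's oid and write it to a file.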
echo "COPY (select oid from pg_database where datname = current_database()) TO '$dboidfile'" | postgres --single -D $datadir $database > /dev/null
if (($?))
then
echo "FATAL: Could not start Postgres in single user mode"
exit 1
fi
dboid=`cat $dboidfile`
# List candidate file nodes: file names made up only of digits in this database's directory.
filenodes=`(cd $datadir/base/$dboid; ls [0-9]*[0-9] | grep -v '\.' | sort -n)`
#echo $filenodes
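# Collect every relfilenode plus every pg_class oid; as noted above, some rows had no relfilenode set, so both are needed.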
echo "COPY (select relfilenode from pg_class union select oid as relfilenode from pg_class) TO '$reloidfile'" | postgres --single -D $datadir $database > /dev/null
relfilenodes=`cat $reloidfile`
#echo $relfilenodes
if [[ -z $relfilenodes ]]
then
echo "FATAL: did not get any relfilenodes"
exit 2
fi
mkdir lostnfound;
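# Move any file node that has no matching pg_class entry into lostnfound for inspection.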
for f in $filenodes
do
if [[ -z `echo $relfilenodes | grep -w $f` ]]
then
echo moving $f to lostnfound
mv $datadir/base/$dboid/$f lostnfound
mv $datadir/base/$dboid/${f}_* lostnfound 2> /dev/null
mv $datadir/base/$dboid/${f}.* lostnfound 2> /dev/null
fi
done
rm $dboidfile
rm $reloidfile
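Hypothetically, if the script above is saved as orphan_cleanup.sh, a run looks like this (the cluster is restarted manually afterwards so lostnfound can be reviewed first):

bash orphan_cleanup.sh /var/lib/postgresql/9.6/main mydatabase
ls lostnfound
pg_ctl -D /var/lib/postgresql/9.6/main start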