leaking lots of unreferenced inodes (pg_xlog files?), maybe after moving tables and indexes to tablespace on different volume - Mailing list pgsql-hackers

From Palle Girgensohn
Subject leaking lots of unreferenced inodes (pg_xlog files?), maybe after moving tables and indexes to tablespace on different volume
Date
Msg-id 513FCA39.7030709@FreeBSD.org
Whole thread Raw
Responses Re: leaking lots of unreferenced inodes (pg_xlog files?), maybe after moving tables and indexes to tablespace on different volume  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi!

Running postgresql-9.2.2 on FreeBSD 9.1 using vanilla ufs file system.

I have the postgresql base/ on the /usr disk, and a separate volume /opt
where the default tablespace resides. I found many databases that had
not used the default tablespace, an ran many hundreds of

ALTER TABLE foobar SET TABLESPACE foo;
ALTER INDEX foobar_idx SET TABLESPACE foo;

a while back to free up disk space on /usr. I got lots of space freed
up, but it seems that after that the disk usage grows linearly (it seems
to leave many inodes unreferenced).

The the discrepancy between df and du is now huge:

# du -sxh /usr; df -h /usr
4,6G    /usr
Filesystem     Size    Used   Avail Capacity  Mounted on
/dev/da0s1f    104G     88G    8.0G    92%    /usr

4,6G vs 88GB, that must be more than a rounding error?

Strange thing is I cannot find any open files.

# lsof /usr| awk '{print $9}'|xargs ls -l > /dev/null

returns no errors (a missing file would render an error with ls). If
there where open files not referenced in any directory, they should be
found.

Next thing is fsck, and yes, there are plenty of unreferenced files.

I ran fsck while system is running (i.e. read only) to get a grip oif
the amount of lost inodes:

fsck /usr | awk '{print $1}'|cut -f 2 -d=| perl -e '$i = 0; while (<>) {
$i += $_;}; print $i / 1024 / 1024; print "\n";'
85223.3530330658

~85 GB gone, that's 80% of the disk, and it accounts fo all the missing
space.

MTIME for the inodes are pretty evenly spread over time since the
machine was updated to FreeBSD 9.1, rebooted, and PostgreSQL was updated
to 9.2. All was done at the same time, so I can't really tell who's to
blaim, but this is the only server out of a dozen where we did all these
ALTER table/index SET TABLEPSPACE... and it is the only server, out of a
dozen that where updated to exactly the same versions, that has this
problem. All other servers have their base/ disk as horisontal the
horizone (since all data resides on a separate tablespace). All servers
where updated from 8.4 or 9.0 using pg_update. Only this server has
problems. This is the only server with a multitude (~30) active
databases, and the only one where we ran the SET TABLESPACE stuff.

The unreferenced inodes are almost exclusively around 16 MB in size, so
i.e. they would most probably all be pg_xlog files.

config in postgresql.conf is checkpoint_segments = 40

Could there be anything fishy in the code that moves the indexes or
tables between different tablespaces? How could it possibly be able to
leave unreferenced inodes around like this? Is the culprit a combination
of postgresql and file system code?

checkpoints seems to happen approximately every three minutes:

Mar 13 00:39:08 dbserver postgres[5298]: [48-1] db=,user= LOG:
checkpoint starting: time
Mar 13 00:41:38 dbserver postgres[5298]: [49-1] db=,user= LOG:
checkpoint complete: wrote 2542 buffers (0.3%); 0 transaction log
file(s) added, 0 removed, 1 recycled; write=149.667 s, sync=0.101 s,
total=149.770 s; sync files=628, longest=0.021 s, average=0.000 s
Mar 13 00:44:08 dbserver postgres[5298]: [50-1] db=,user= LOG:
checkpoint starting: time
Mar 13 00:46:38 dbserver postgres[5298]: [51-1] db=,user= LOG:
checkpoint complete: wrote 3996 buffers (0.4%); 0 transaction log
file(s) added, 0 removed, 1 recycled; write=149.438 s, sync=0.111 s,
total=149.551 s; sync files=823, longest=0.006 s, average=0.000 s
Mar 13 00:49:08 dbserver postgres[5298]: [52-1] db=,user= LOG:
checkpoint starting: time
Mar 13 00:51:38 dbserver postgres[5298]: [53-1] db=,user= LOG:
checkpoint complete: wrote 13736 buffers (1.4%); 0 transaction log
file(s) added, 0 removed, 2 recycled; write=149.958 s, sync=0.311 s,
total=150.271 s; sync files=1335, longest=0.079 s, average=0.000 s
Mar 13 00:54:08 dbserver postgres[5298]: [54-1] db=,user= LOG:
checkpoint starting: time
Mar 13 00:56:38 dbserver postgres[5298]: [55-1] db=,user= LOG:
checkpoint complete: wrote 14638 buffers (1.5%); 0 transaction log
file(s) added, 0 removed, 17 recycled; write=149.330 s, sync=0.271 s,
total=149.603 s; sync files=1363, longest=0.017 s, average=0.000 s
Mar 13 00:59:08 dbserver postgres[5298]: [56-1] db=,user= LOG:
checkpoint starting: time
Mar 13 01:01:38 dbserver postgres[5298]: [57-1] db=,user= LOG:
checkpoint complete: wrote 8035 buffers (0.8%); 0 transaction log
file(s) added, 0 removed, 21 recycled; write=149.285 s, sync=0.146 s,
total=149.433 s; sync files=1160, longest=0.003 s, average=0.000 s
Mar 13 01:04:08 dbserver postgres[5298]: [58-1] db=,user= LOG:
checkpoint starting: time
Mar 13 01:06:37 dbserver postgres[5298]: [59-1] db=,user= LOG:
checkpoint complete: wrote 2156 buffers (0.2%); 0 transaction log
file(s) added, 0 removed, 9 recycled; write=149.402 s, sync=0.057 s,
total=149.461 s; sync files=610, longest=0.000 s, average=0.000 s
Mar 13 01:09:08 dbserver postgres[5298]: [60-1] db=,user= LOG:
checkpoint starting: time


I'm pretty certain that unmounting the file system and running fsck will
regain the lost space, but will it stop there? Has it got to do with the
ALTER table/index SET TABLESPACE, or is that a smoke screen, and it is
really something else?

Stopping postgresql briefly did not help, I tried that.

The server has about 30 databases and ~127 concurrent connections (not
all beeing active simultaneously, though), so it is fair to say it is
pretty active, but nothing extreme.

Hardware is HP DL360, using their HT Smart Array P410i.

Any ideas how to debug this? Or shall I just reboot, fsck, hope the
problem will go away, and when it does, forget about it?

Thanks,
Palle
-----BEGIN PGP SIGNATURE-----
Version: GnuPG/MacGPG2 v2.0.17 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJRP8o5AAoJEIhV+7FrxBJDTl0H/2JWb3lFuaL+QYDe9+Le8zhu
7Q1KEYeY91Noq88rLa+eM3ENPibrsJYGYWYaQb9U2PSFXvud5/+czWMZrcVx+3H4
+tFME+FO+f9kasNrGA++LmRwlXlSWYZj9oZXumdIbmqX0Nvne0nvMc34Qyg8yqB5
l3pMFRdMEO4kM9n/pJkQf/7nuWZDEvmNTlkL1wARjGyTa8GYQa1a4pBDq/ovXuE8
+6NF64X8Mbm55geEHIybop72BTcfQmGMpYaeInjH2V2zbaqC6HV3eqfTv+CGT8op
tAxRYKa/9brYj9IIcaLn+zQvBTxbklZVON6BrlKI6+z144nXOdH+Ny4Zs4upP+Y=
=J6T7
-----END PGP SIGNATURE-----



pgsql-hackers by date:

Previous
From: Josh Berkus
Date:
Subject: Re: Display output file name in psql prompt?
Next
From: Tom Lane
Date:
Subject: Re: leaking lots of unreferenced inodes (pg_xlog files?), maybe after moving tables and indexes to tablespace on different volume