Re: leaking lots of unreferenced inodes (pg_xlog files?), maybe after moving tables and indexes to tablespace on different volume - Mailing list pgsql-hackers

From Dan Thomas
Subject Re: leaking lots of unreferenced inodes (pg_xlog files?), maybe after moving tables and indexes to tablespace on different volume
Msg-id CAG8duQ0hACFEk2DhRpDQv30=FHUhew6j8UGeByrw-Fj1Z-cbow@mail.gmail.com
In response to Re: leaking lots of unreferenced inodes (pg_xlog files?), maybe after moving tables and indexes to tablespace on different volume  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
We're seeing similar behaviour on several of our FreeBSD servers too.
It doesn't look like open files, or filesystem snapshots. Rebooting
does reset it, but restarting PG makes no difference.

We've got an assortment of different versions of both FreeBSD and
PostgreSQL, some of which are demonstrating this behaviour, some
aren't. Here's a quick breakdown of versions and what we've got
running:

FreeBSD   PostgreSQL   Leaking?
8.0       8.4.4        no
8.2       9.0.4        no
8.3       9.1.4        yes
8.3       9.2.3        yes
9.1       9.2.3        yes

All of these machines are under similar load patterns and, apart from
the version differences, are set up essentially the same and are
doing the same job. They all have hot standbys, yet this problem
doesn't exist on any of the standby servers. We haven't done anything
with tablespaces; the database has its own dedicated partition
(although pg_log and pg_xlog are both symlinked out to /usr).

However (just to throw a spanner in the works) we do have another
server running fbsd8.3/pg9.1.4 which ISN'T showing this behaviour -
although its load patterns are quite different.

I'm not sure if this is going to help, but here's a graph of this disk
space disparity over the last few days (Y axis is in gigabytes). The
flat-ish part in the middle is the weekend where we have little
traffic, so we can at least say it's not constant:
http://i.imgur.com/jlbgzNI.png
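
For anyone who wants to track the same number on their own machines,
here is a rough sketch of one way such a disparity could be measured:
the space the filesystem reports as used, minus the space that can be
accounted for by walking the directory tree. This is an assumption
about how to measure it, not necessarily how the graph above was
produced. The function names are made up for illustration, and the
walk needs enough privileges to read every directory, otherwise
unreadable files will inflate the result.

```python
import os


def used_bytes_per_statvfs(mount_point):
    """Space the filesystem itself reports as in use (roughly what df shows)."""
    st = os.statvfs(mount_point)
    return (st.f_blocks - st.f_bfree) * st.f_frsize


def allocated_bytes_per_walk(mount_point):
    """Space reachable by walking the tree (roughly what du -x shows).

    Stays on one filesystem and counts each inode once, so hard links
    are not double-counted. st_blocks is in 512-byte units.
    """
    root = os.lstat(mount_point)
    root_dev = root.st_dev
    seen = {root.st_ino}
    total = root.st_blocks * 512

    for dirpath, dirnames, filenames in os.walk(mount_point):
        same_fs_dirs = []
        for name in dirnames + filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if st.st_dev != root_dev:
                continue  # belongs to another mounted filesystem
            if name in dirnames:
                same_fs_dirs.append(name)
            if st.st_ino not in seen:
                seen.add(st.st_ino)
                total += st.st_blocks * 512
        # Prune in place so os.walk does not descend into other mounts.
        dirnames[:] = same_fs_dirs
    return total


def disparity_bytes(mount_point):
    """Used space that no reachable file or directory accounts for."""
    return used_bytes_per_statvfs(mount_point) - allocated_bytes_per_walk(mount_point)
```

Calling disparity_bytes() on the database partition's mount point at
regular intervals and logging the result would give a series
comparable to the graph. A small non-zero value is normal (filesystem
metadata, files created or removed during the walk); it is steady
growth that matters.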

Up until now we've been upgrading things in the hope that the problem
will go away, but since we've got one server up to fbsd9.1/pg9.2.3 and
are still seeing the problem, we're a little stumped. Any ideas about
how we can go about debugging this would be appreciated.

Thanks,

Dan

On 13 March 2013 07:39, Magnus Hagander <magnus@hagander.net> wrote:
>
> On Mar 13, 2013 3:04 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>>
>> Palle Girgensohn <girgen@FreeBSD.org> writes:
>> > ... I got lots of space freed
>> > up, but it seems that after that the disk usage grows linearly (it seems
>> > to leave many inodes unreferenced).
>>
>> Hm.  We've seen issues in the past with PG processes failing to close
>> no-longer-useful files promptly, but ...
>>
>> > Strange thing is I cannot find any open files.
>>
>> ... that suggests there's something else going on.
>>
>> > The unreferenced inodes are almost exclusively around 16 MB in size, so
>> > i.e. they would most probably all be pg_xlog files.
>>
>> Have you got any sort of WAL archiving active, and if so maybe that's
>> holding onto WAL files?  Not that it's clear how come lsof wouldn't
>> tattle on an archiving process either.
>>
>> > Stopping postgresql briefly did not help, I tried that.
>>
>> That seems to point the finger at some non-postgres cause.  I confess
>> I can't guess what.
>>
>
> Yeah, unreferenced inodes with no open files, and only discoverable with
> fsck sounds like a filesystem bug to me. Particularly since it showed up just
> after an operating system upgrade, and doesn't go away with a postgres
> restart...
>
> /Magnus
