Thread: "Leaking" disk space on FreeBSD servers

"Leaking" disk space on FreeBSD servers

From
Dan Thomas
Date:
Hi Guys,

We're seeing a problem with some of our FreeBSD/PostgreSQL servers "leaking" quite significant amounts of disk space:

    > df -h /usr/local/pgsql/
    Filesystem       Size    Used   Avail Capacity  Mounted on
    /dev/mfid1s1d    1.1T    772G    222G    78%    /usr/local/pgsql

    > du -sh /usr/local/pgsql/
    741G    /usr/local/pgsql/
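
The gap between the two figures can also be computed directly, which makes it easier to track over time. This is only a minimal sketch: `gap_kb` is a hypothetical helper name, and it assumes the argument is the mount point of the affected file system and that `df -kP` and `du -skx` are available (they are on FreeBSD and on GNU systems).

```shell
# gap_kb MOUNTPOINT (hypothetical helper): print the space that df counts
# as used but that du cannot find by walking the tree, in KiB.
gap_kb() {
    mp=$1
    # Column 3 of the second line of "df -kP" is the used space in 1K blocks.
    used=$(df -kP "$mp" | awk 'NR == 2 { print $3 }')
    # -s prints only the grand total; -x keeps du on this one file system.
    walked=$(du -skx "$mp" 2>/dev/null | awk '{ print $1 }')
    # Fall back to 0 if either command produced no output.
    echo $(( ${used:-0} - ${walked:-0} ))
}
```

Running `gap_kb /usr/local/pgsql` on the server above would report roughly the 31G difference between 772G and 741G, expressed in KiB.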

Stopping Postgres doesn't fix it, but rebooting does which points at the OS rather than PG to me. However, the leak is only apparent in the dedicated pgsql partition, and only on our database servers, so PostgreSQL seems to at least be involved. The partition itself is a relatively standard UFS partition:

    > grep /usr/local/pgsql /etc/fstab
    /dev/mfid1s1d   /usr/local/pgsql    ufs   rw   2   2

    > tunefs -p /usr/local/pgsql/
    tunefs: POSIX.1e ACLs: (-a)                                disabled
    tunefs: NFSv4 ACLs: (-N)                                   disabled
    tunefs: MAC multilabel: (-l)                               disabled
    tunefs: soft updates: (-n)                                 enabled
    tunefs: gjournal: (-J)                                     disabled
    tunefs: trim: (-t)                                         disabled
    tunefs: maximum blocks per file in a cylinder group: (-e)  2048
    tunefs: average file size: (-f)                            16384
    tunefs: average number of files in a directory: (-s)       64
    tunefs: minimum percentage of free space: (-m)             8%
    tunefs: optimization preference: (-o)                      time
    tunefs: volume label: (-L)                                

lsof isn't showing any open files with a zero link count (i.e. files that were deleted but are still held open):

    > lsof +L /usr/local/pgsql/ | awk '{ print $8 }' | grep 0 | wc -l
    0
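
A slightly stricter variant of the same check, offered as a sketch rather than a correction: `grep 0` also matches link counts such as 10, and field 8 is only NLINK when no earlier column is blank. lsof can do the selection itself with `+L1`. The helper name `count_unlinked` is made up for illustration.

```shell
# count_unlinked MOUNTPOINT (hypothetical helper): count open files on the
# file system whose link count is zero, i.e. deleted but still held open.
count_unlinked() {
    # +L1 selects open files with a link count below 1; the "a" ANDs that
    # selection with the file system argument instead of ORing the two.
    # NR > 1 skips the header line; "n + 0" prints 0 when nothing matched.
    lsof +aL1 "$1" 2>/dev/null | awk 'NR > 1 { n++ } END { print n + 0 }'
}
```

A result of 0 from `count_unlinked /usr/local/pgsql` would agree with the pipeline above.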

We're not creating filesystem snapshots:

    > find /usr/local/pgsql/ -flags snapshot
    >

Not all of our servers are leaking space, it's only the more recently-installed systems. Here's a quick breakdown of versions:

    FreeBSD   PostgreSQL   Leaking?
    8.0       8.4.4        no
    8.2       9.0.4        no
    8.3       9.1.4        yes
    8.3       9.2.3        yes
    9.1       9.2.3        yes

Each of these servers is configured with a warm standby, so we've been switching them over to the standby to reclaim the space (rebooting the primary is too much downtime). The standby does *not* demonstrate this problem while it's being used as a standby, but it starts leaking space once it's been made the primary.

Initially I thought this might be related to WAL files, however the pg_xlog dir is symlinked outside of the /usr/local/pgsql partition that is demonstrating this problem:

    > ll /usr/local/pgsql/data/pg_xlog    
    lrwxr-xr-x 25B Oct 19 10:48 pg_xlog -> /usr/local/pglog/pg_xlog/
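
To double-check that the symlink target really sits on a different file system, the mount points can be compared. A small sketch, assuming mount points without spaces in their names; `fs_of` is a hypothetical helper, not an existing tool.

```shell
# fs_of PATH (hypothetical helper): print the mount point of the file
# system that PATH resolves to. The trailing "/." ensures a symlinked
# directory is resolved to its target.
fs_of() {
    # The last column of the second line of "df -P" is the mount point.
    df -P "$1/." | awk 'NR == 2 { print $NF }'
}

# Example comparison (paths taken from the listing above):
#   [ "$(fs_of /usr/local/pgsql/data/pg_xlog)" != "$(fs_of /usr/local/pgsql)" ] \
#       && echo "pg_xlog is on a different file system"
```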

I've exhausted everything I can think of to try to solve this one. Has anyone got any ideas on how to go about debugging this?

Thanks,

Dan

Re: "Leaking" disk space on FreeBSD servers

From
Achilleas Mantzios
Date:

Did you do a detailed du during the supposed problem and after the reboot and make a diff of those to find any involved files/dirs?

That said, I think you might consider posting on freebsd-[questions|stable] as well.

 

-
Achilleas Mantzios
IT DEV
IT DEPT
Dynacom Tankers Mgmt

Re: "Leaking" disk space on FreeBSD servers

From
Dan Thomas
Date:
> Did you do a detailed du during the supposed problem and after the reboot and make a diff of those to find any involved files/dirs?

du doesn't show the space in question (du -s shows the actual usage on
disk, df is showing a much higher number), so I doubt this will show
anything up. However, next reboot I'll certainly do that.

> That said, i think you might consider posting on freebsd-[questions|stable] as well.

Yes I think that might be a good plan :)

Dan



Re: "Leaking" disk space on FreeBSD servers

From
Achilleas Mantzios
Date:
On Wed 20 Mar 2013 12:47:39 Dan Thomas wrote:
> > Did you do a detailed du during the supposed problem and after the reboot and make a diff of those to find any involved files/dirs?
>
> du doesn't show the space in question (du -s shows the actual usage on
> disk, df is showing a much higher number), so I doubt this will show
> anything up. However, next reboot I'll certainly do that.

du (without -s) shows the whole hierarchy, whereas du -s behaves like du -d 0,
so at this point diffing the output of (plain) du is definitely something worth doing.
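
A minimal sketch of that before/after comparison, assuming the snapshots are written somewhere outside the partition being measured; `du_snapshot` and the file names under /var/tmp are hypothetical.

```shell
# du_snapshot DIR OUTFILE (hypothetical helper): record per-directory usage
# in KiB, sorted by path so that two snapshots line up for diff.
du_snapshot() {
    # -k reports KiB; -x stays on the file system that DIR lives on.
    du -kx "$1" 2>/dev/null | sort -k 2 > "$2"
}

# Typical use: one snapshot while the gap is present, another right after
# the reboot, then compare the two.
#   du_snapshot /usr/local/pgsql /var/tmp/du.before
#   ... reboot ...
#   du_snapshot /usr/local/pgsql /var/tmp/du.after
#   diff /var/tmp/du.before /var/tmp/du.after
```

Any directory whose size changes across the reboot shows up as a changed line in the diff.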



Re: "Leaking" disk space on FreeBSD servers

From
Vick Khera
Date:

On Wed, Mar 20, 2013 at 7:49 AM, Dan Thomas <godders@gmail.com> wrote:
> Not all of our servers are leaking space, it's only the more recently-installed systems. Here's a quick breakdown of versions:

FWIW, I do not observe this behavior. My database has very heavy write load, and old data is purged after it is aged about 7 months, so I do get lots of fragmentation. However, I do not have any disk space "phantom" loss.

How long does it take for you to accumulate this "leak"?  My first instinct is that you have unlinked files still referenced by some application. That is really the only way you get these discrepancies. lsof *should* have shown them to you.  Try fstat in case there's some bug in lsof.

Also, your tunefs output seems to be not from FreeBSD 9.1. Specifically, it is not emitting this line:

    tunefs: soft update journaling: (-j)                       disabled

It is a very useful option to turn on for large file systems. I can recover a 6TB file system in about 5 seconds on a crash reboot with that on.



    [root@d04]# ps axuw34214
    USER    PID %CPU %MEM     VSZ    RSS TT  STAT STARTED    TIME COMMAND
    pgsql 34214  0.0  0.5 5426964 154484  0- S    28Feb13 1:30.66 /usr/local/bin/postgres -D /u/data/postgres
    [root@d04]# df -h /u/data
    Filesystem          Size    Used   Avail Capacity  Mounted on
    /dev/ufs/ramdisk    707G    137G    513G    21%    /u/data
    [root@d04]# du -sh /u/data
    137G /u/data
    [root@d04]# uname -a
    FreeBSD d04.m1e.net 9.1-RELEASE FreeBSD 9.1-RELEASE #1 r243808: Mon Dec  3 09:56:27 EST 2012     vivek@lorax.kcilink.com:/usr/obj/u/lorax1/usr9/src/sys/KCI64  amd64
    [root@d04]# uptime
     9:50AM  up 74 days, 17:36, 1 user, load averages: 0.21, 0.18, 0.17
    [root@d04]# psql --version
    psql (PostgreSQL) 9.2.3
    [root@d04]#

Re: "Leaking" disk space on FreeBSD servers

From
Achilleas Mantzios
Date:

Regarding journaling, there is the counter-argument that you do not need to do the same job twice, in the sense that we already spend a considerable amount of time retaining the WAL in PostgreSQL, so there is no need to redo the same at the FS level.

"Crash"-intensive systems (for lack of a better word) might benefit from FS journaling, but the best option here is to try to find the cause. FreeBSD systems are not supposed to crash; that's why people use them in the first place.

 


Re: "Leaking" disk space on FreeBSD servers

From
Vick Khera
Date:

On Wed, Mar 20, 2013 at 10:34 AM, Achilleas Mantzios <achill@matrix.gatewaynet.com> wrote:

> regarding journaling, there is the counter argument that you do not need to do the same job twice, in the sense that we already spend a considerable amount of time retaining the WAL in postgresql, no need to redo the same on FS level.


There's a difference between file system integrity and DB integrity.  PG will keep the DB integrity just fine without the file system journaling. The journaling just makes recovery from a crash that much faster: running fsck on 6TB of disk storage takes a LONG time, sometimes hours, but with the journal enabled it takes a few seconds.

Re: "Leaking" disk space on FreeBSD servers

From
Achilleas Mantzios
Date:

Of course, but does it make sense for you to pay the ~5%/day performance penalty for the ~0.5%/year chance of having your system crash?

Unless your FreeBSD server is stuffed with exotic gamer hardware, I don't see the likelihood of a crash getting larger than that.

 


Re: "Leaking" disk space on FreeBSD servers

From
Dan Thomas
Date:
> How long does it take for you to accumulate this "leak"?

It grows by between 2 and 4 gigabytes per day on average. It seems to
be related to load on the database, as it grows more slowly over the
weekends when the servers are under less load. Here's a graph that
shows the growth on one server (from reboot to a difference of about
30 GB) over the last couple of weeks:
http://i.imgur.com/AwzQ46j.png

The flatter parts are the weekends, but otherwise it's reasonably constant.
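
For anyone wanting to reproduce a growth figure like this from two samples of the df/du difference, here is a rough sketch. `gap_rate` is a hypothetical helper; it assumes timestamps in epoch seconds and gap sizes in KiB.

```shell
# gap_rate T0 GAP0_KB T1 GAP1_KB (hypothetical helper): average growth of
# the gap in GiB per day between two samples; T0 and T1 are epoch seconds.
gap_rate() {
    awk -v t0="$1" -v g0="$2" -v t1="$3" -v g1="$4" 'BEGIN {
        days = (t1 - t0) / 86400
        # Refuse samples that are not in chronological order.
        if (days <= 0) exit 1
        gib = (g1 - g0) / (1024 * 1024)
        printf "%.2f\n", gib / days
    }'
}
```

With samples ten days apart and a gap that grew by 30 GiB, `gap_rate 0 0 864000 31457280` prints 3.00, which is in line with the 2 to 4 gigabytes per day mentioned above.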

> That is really the only way you get these discrepancies. lsof *should* have shown them to you.
> Try fstat in case there's some bug in lsof.

After a bit of messing around with fstat and find (it doesn't make it
easy!) I've confirmed all the inodes fstat is reporting exist on disk,
which backs up lsof. I also confirmed both lsof and fstat were
detecting unlinked files by manually holding a file open and unlinking
it. Can't see much evidence that it's an open file.
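
The fstat/find cross-check described here could be scripted roughly as follows. This is a sketch under assumptions, not the exact commands that were run: `orphan_inodes` is a hypothetical name, and it assumes the inode number is the sixth column (INUM) of `fstat -f` output.

```shell
# orphan_inodes MOUNTPOINT (hypothetical helper): print inode numbers that
# fstat reports as open on MOUNTPOINT but that have no directory entry there.
orphan_inodes() {
    mp=$1
    on_disk=$(mktemp) || return 1
    # "ls -di" prints "<inode> <path>"; -xdev keeps find on this file system.
    find "$mp" -xdev -exec ls -di {} + 2>/dev/null |
        awk '{ print $1 }' | sort -u > "$on_disk"
    # NR > 1 skips the fstat header; comm -23 keeps the inodes that appear
    # only in the fstat list and not in the on-disk list.
    fstat -f "$mp" | awk 'NR > 1 && $6 ~ /^[0-9]+$/ { print $6 }' | sort -u |
        comm -23 - "$on_disk"
    rm -f "$on_disk"
}
```

Empty output means every open inode still has a name on disk, which matches the result reported above.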

> Also, your tunefs output seems to be not from FreeBSD 9.1

That example was from an 8.3 machine (it's the one I'm testing with as
it's got the biggest disk usage deficit). The 9.1 box has this:

    tunefs: soft update journaling: (-j)                       enabled

..but is still exhibiting this behaviour.

> FWIW, I do not observe this behaviour

We actually have another FreeBSD8.3/PG9.1 machine under different (but
similar) load that *doesn't* demonstrate this behaviour. There's
nothing obvious in the differences in usage patterns that we can see
(we're not using any exotic features or anything), but it certainly
suggests that it's *something* related to PG or our usage of it.



Re: "Leaking" disk space on FreeBSD servers

From
Achilleas Mantzios
Date:
On Wed 20 Mar 2013 15:15:23 Dan Thomas wrote:

>
> We actually have another FreeBSD8.3/PG9.1 machine under different (but
> similar) load that *doesn't* demonstrate this behaviour. There's
> nothing obvious in the differences in usage patterns that we can see
> (we're not using any exotic features or anything), but it certainly
> suggests that it's *something* related to PG or our usage of it.
>

Any difference in the architecture of the two systems? (x86, amd64, etc..)
Any difference in the respective output of
% pg_config
?



Re: "Leaking" disk space on FreeBSD servers

From
Dan Thomas
Date:
> Any difference in the architecture of the two systems? (x86, amd64, etc..)
> Any difference in the respective output of
> % pg_config

Alas, no. Both are identical machines running identical versions of
FreeBSD and PG. pg_config on the two machines matches exactly.



Re: "Leaking" disk space on FreeBSD servers

From
Kevin Grittner
Date:
Dan Thomas <godders@gmail.com> wrote:

> We're seeing a problem with some of our FreeBSD/PostgreSQL
> servers "leaking" quite significant amounts of disk space:

> Stopping Postgres doesn't fix it, but rebooting does which points
> at the OS rather than PG to me. However, the leak is only
> apparent in the dedicated pgsql partition, and only on our
> database servers, so PostgreSQL seems to at least be involved.
> The partition itself is a relatively standard UFS partition:

I saw something once which *might* be related.  I don't recall the
OS or FS involved, but in an attempt to reduce the fragmentation of
files which started small and eventually grew large, a large
allocation of contiguous space was made on file creation, and that
space was not released as long as any page for the file remained in
the OS cache.  In the instance where I saw the problem, autovacuum
had been turned off and the instance was just coming up on the
point where wraparound prevention runs were about to be triggered.
pg_clog was where most of the wasted space was.

No guarantees that this is the issue, but it sounded similar....

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: "Leaking" disk space on FreeBSD servers

From
Shaun Thomas
Date:
On 03/20/2013 01:25 PM, Kevin Grittner wrote:

> I saw something once which *might* be related.  I don't recall the
> OS or FS involved, but in an attempt to reduce the fragmentation of
> files which started small and eventually grew large, a large
> allocation of contiguous space was made on file creation, and that
> space was not released as long as any page for the file remained in
> the OS cache.

That was an optimization decision made for XFS in recent kernels, and
the chunks it grabs are very, very large. We had to reduce the default
allocation size to 1MB to disable the elastic allocation system. In the
end, we regained about 50GB of "phantom" space after a re-mount, and
it's stayed that way since.

But that's what du --apparent-size is for. :)
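
For reference, `--apparent-size` is a GNU du option; FreeBSD's du has `-A` for the same purpose. The difference between a file's length and the blocks allocated to it can also be seen per file with standard tools. A small sketch, where `size_pair` is a hypothetical helper:

```shell
# size_pair FILE (hypothetical helper): print "<apparent bytes> <allocated KiB>".
# wc -c reports the file length; du -k reports the space actually allocated.
size_pair() {
    apparent=$(wc -c < "$1" | tr -d ' ')
    allocated=$(du -k "$1" | awk '{ print $1 }')
    echo "$apparent $allocated"
}
```

A file with speculative preallocation shows an allocated figure well above its length, while a sparse file shows the opposite.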

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com
