Thread: Why do I have holes in my pages?

Why do I have holes in my pages?

From
Aleksey Tsalolikhin
Date:
Why do I have holes in my pages?

Postgres 8.4.12

 "select ctid from big_table" on my master shows that pages have
"holes" in them.

Here is example for page 431665:

 (431665,2)
 (431665,5)
 (431665,8)
 (431665,11)
 (431665,14)
 (431665,17)
 (431665,20)
 (431665,23)

Current DB size is 400 GB and it takes up 2.7 TB (on a 6.6 TB filesystem).

So what happened to rows 1, 3 and 4 and so on?

I have to size a database server for next year's budget, and I will
have to explain to my mgmt why we have 400 GB taking up 2.7 TB.  Help?
 Would appreciate a pointer to the appropriate section in the manual
if this is documented.

Thanks,
-at


Re: Why do I have holes in my pages?

From
Bill Moran
Date:
In response to Aleksey Tsalolikhin <atsaloli.tech@gmail.com>:
>
> Current DB size is 400 GB and it takes up 2.7 TB (on a 6.6 TB filesystem).

I expect that the first thing that others are going to ask
is "what is telling you that your DB is 400G?"

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/


Re: Why do I have holes in my pages?

From
Aleksey Tsalolikhin
Date:
On Thu, Sep 20, 2012 at 12:34 PM, Bill Moran <wmoran@potentialtech.com> wrote:
> In response to Aleksey Tsalolikhin <atsaloli.tech@gmail.com>:
>>
>> Current DB size is 400 GB and it takes up 2.7 TB (on a 6.6 TB filesystem).
>
> I expect that the first thing that others are going to ask
> is "what is telling you that your DB is 400G?"


Right on.  I got that out of my pgstatspack report.

\l+ in psql tells me the same thing - 400 GB


Re: Why do I have holes in my pages?

From
Victor Yegorov
Date:
Take a look at this part of the documentation:

The “missing” entries belong to the tuples that you have DELETEd/UPDATEd and that are no longer visible
to your current session, but still might be for the others, that started some time ago. When tuples are no longer
needed, VACUUM will “release” the slots by adding them into the FreeSpaceMap.

Still, if you have “empty” slots in the middle of your datafiles, VACUUM cannot resize files.
This leads to the fact that while database size is being not so big, actual disk space occupied by it
is bigger. This is called “bloat”.

Check the output of the query here:

Also, having such a big difference in the reported and actual size of the database, may I ask:
- when was the last time you performed VACUUM?
- don't you have autovacuum = on (which is default) in your configuration?


2012/9/20 Aleksey Tsalolikhin <atsaloli.tech@gmail.com>
On Thu, Sep 20, 2012 at 12:34 PM, Bill Moran <wmoran@potentialtech.com> wrote:
> In response to Aleksey Tsalolikhin <atsaloli.tech@gmail.com>:
>>
>> Current DB size is 400 GB and it takes up 2.7 TB (on a 6.6 TB filesystem).
>
> I expect that the first thing that others are going to ask
> is "what is telling you that your DB is 400G?"


Right on.  I got that out of my pgstatspack report.

\l+ in psql tells me the same thing - 400 GB
 

--
Victor Y. Yegorov

Re: Why do I have holes in my pages?

From
John R Pierce
Date:
On 09/20/12 1:34 PM, Aleksey Tsalolikhin wrote:
> Right on.  I got that out of my pgstatspack report.
>
> \l+ in psql tells me the same thing - 400 GB

it might be interesting to see the output of...

     du -hs $PGDATA/*

(assuming this is a linux or similar unix system).....  This will show
how much space is being used by the various directories under the PG
data directory.    if pg_xlog is very large, you may have an issue with
wal archiving or something.   if pg_log is very large, you may have an
issue with A) too much being logged, and B) nothing cleaning up stale
log files.

how did you arrive at the 2.7TB number?    and what file system does
this 6.6TB volume use?




--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast



Re: Why do I have holes in my pages?

From
Aleksey Tsalolikhin
Date:
On Thu, Sep 20, 2012 at 1:53 PM, John R Pierce <pierce@hogranch.com> wrote:
> On 09/20/12 1:34 PM, Aleksey Tsalolikhin wrote:
>>
>> Right on.  I got that out of my pgstatspack report.
>>
>> \l+ in psql tells me the same thing - 400 GB
>
>
> it might be interesting to see the output of...
>
>     du -hs $PGDATA/*

Well, that was it!  Thanks, John!

2.3T    /data/backups
400G    /data/base

We store our pg_dumps on the same filesystem (they are copied off to
another server but we don't delete them) so it swelled the filesystem
size as reported by "df".

Sorry about that.  And thanks for the help!

Aleksey


Re: Why do I have holes in my pages?

From
Jeff Janes
Date:
On Thu, Sep 20, 2012 at 1:46 PM, Victor Yegorov <vyegorov@gmail.com> wrote:
> Take a look at this part of the documentation:
> http://www.postgresql.org/docs/current/interactive/routine-vacuuming.html#VACUUM-FOR-SPACE-RECOVERY
>
> The “missing” entries belong to the tuples that you have DELETEd/UPDATEd and
> that are no longer visible
> to your current session, but still might be for the others, that started
> some time ago. When tuples are no longer
> needed, VACUUM will “release” the slots by adding them into the
> FreeSpaceMap.

My understanding is (which is not very deep on this topic) is that it
is a lot more complicated than that.

To start with, it can be as you say where the ctid and its tuple are
interesting to someone, but not to you.  But eventually the tuple is
not interesting to anyone, and its space can be reused.  But the ctid
is still needed (to inform stragglers that it's corresponding tuple is
not interesting, and in fact no longer exists, so move on, nothing to
see).  Then eventually even the ctid itself is not needed anymore even
for that purpose.

At that point the ctid can be re-used, but only if someone actually
wants a "new" ctid on that page.  An ordinary vacuum will not close up
the gaps on un-used ctids.  Only a vaccum full will do that.

The space used by these ctid gaps is not large, and as the OP
discovered, his wasted space was in fact happening outside of the
database itself.

Cheers,

Jeff


Re: Why do I have holes in my pages?

From
Victor Yegorov
Date:
2012/9/21 Jeff Janes <jeff.janes@gmail.com>
To start with, it can be as you say where the ctid and its tuple are
interesting to someone, but not to you.  But eventually the tuple is
not interesting to anyone, and its space can be reused.  But the ctid
is still needed (to inform stragglers that it's corresponding tuple is
not interesting, and in fact no longer exists, so move on, nothing to
see).  Then eventually even the ctid itself is not needed anymore even
for that purpose.

At that point the ctid can be re-used, but only if someone actually
wants a "new" ctid on that page.  An ordinary vacuum will not close up
the gaps on un-used ctids.  Only a vaccum full will do that.

In the “Routine Vacuuming” section of the documentation I read that:
   «The standard form of VACUUM removes dead row versions in tables and indexes and marks the space available for future reuse.»
and
   «In contrast, VACUUM FULL actively compacts tables by writing a complete new version of the table file with no dead space.»

What I wanted to say in the previous post was exactly this: vacuum will mark the space (or gap) as free, while no space will be “returned” to the OS, except if free pages are at the very end of the data file. I haven't mentioned vacuum full at all.

It seems that this also matches your explanation, correct me if I'm wrong.


--
Victor Y. Yegorov

Re: Why do I have holes in my pages?

From
Jeff Janes
Date:
On Fri, Sep 21, 2012 at 11:41 AM, Victor Yegorov <vyegorov@gmail.com> wrote:
>
> It seems that this also matches your explanation, correct me if I'm wrong.

I think that the explanations do generally match.  But, just because
you observe that the ctid space has not been reused (like the OP did),
does not mean that the tuple-space has not been reused.  My
understanding is that tuple space is easier to reuse than ctid space
is.  Sometimes that distinction is important, and sometimes it is not.
 In general, doing "select ctid..." is a poor way of figuring out
where the space in your database is going.

Cheers,

Jeff


Re: Why do I have holes in my pages?

From
Aleksey Tsalolikhin
Date:
On Fri, Sep 21, 2012 at 12:45 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Fri, Sep 21, 2012 at 11:41 AM, Victor Yegorov <vyegorov@gmail.com> wrote:
>>
>> It seems that this also matches your explanation, correct me if I'm wrong.
>
>  In general, doing "select ctid..." is a poor way of figuring out
> where the space in your database is going.

Noted, thank you.

Aleksey


Re: Why do I have holes in my pages?

From
Sergey Konoplev
Date:
On Fri, Sep 21, 2012 at 11:30 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> At that point the ctid can be re-used, but only if someone actually
> wants a "new" ctid on that page.  An ordinary vacuum will not close up
> the gaps on un-used ctids.  Only a vaccum full will do that.

There are a couple of ways to do that except the vacuum full that
locks the table exclusively.

1. pg_reorg can re-organize tables on a postgres database without
locks. However it requires twice the space of the table size and might
lead to IO spikes.
2. pgcompactor a tables and indexes bloat reducing tool, without
locking also. It is slower than pg_reorg but does its job more gently.

>
> The space used by these ctid gaps is not large, and as the OP
> discovered, his wasted space was in fact happening outside of the
> database itself.
>
> Cheers,
>
> Jeff
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general



--
Sergey Konoplev

a database and software architect
http://www.linkedin.com/in/grayhemp

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +14158679984