Growing Database Size

From: Jason Godden
Date: Sat, 14 Jun 2003 16:35:16 +1000
Hi All,

I've got a bit of a dilemma with growing database sizes.  Before upgrading to
7.3.2, my 7.3.0 database had an average size of 1.4-2.0GB.  I have a large
number of LO records in my db and presumed the size growth was because of
them.  However, I do have the appropriate trigger functions on my LO records
to ensure that lo_manage runs to unlink the old references etc.
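
The triggers follow the usual contrib/lo pattern, something like this (table
and column names here are just illustrative):

CREATE TRIGGER t_docs_lo
    BEFORE UPDATE OR DELETE ON docs
    FOR EACH ROW EXECUTE PROCEDURE lo_manage(doc_oid);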

Since moving to 7.3.2, even though I VACUUM regularly, my DB size has
increased to 3.2GB on disk!!  The raw data only consumes about 700-900MB, so I
feel that an additional 2.2GB for index and statistics info is a bit much!
Any clues from the experienced among the group?

PS - the strange thing is that even with that huge change on disk, PG's
performance hasn't really been affected... weird.

Thanks,

Jason



Re: Growing Database Size

From: Bruno Wolff III
On Sat, Jun 14, 2003 at 16:35:16 +1000,
  Jason Godden <jasongodden@optushome.com.au> wrote:
> Hi All,
>
> I've got a bit of a dilemma with growing database sizes.  Before upgrading to
> 7.3.2, my 7.3.0 database had an average size of 1.4-2.0GB.  I have a large
> number of LO records in my db and presumed the size growth was because of
> them.  However, I do have the appropriate trigger functions on my LO records
> to ensure that lo_manage runs to unlink the old references etc.
>
> Since moving to 7.3.2, even though I VACUUM regularly, my DB size has
> increased to 3.2GB on disk!!  The raw data only consumes about 700-900MB, so I
> feel that an additional 2.2GB for index and statistics info is a bit much!
> Any clues from the experienced among the group?
>
> PS - the strange thing is that even with that huge change on disk, PG's performance

You probably want to run a REINDEX. If you have an index on a column
whose values only increase or decrease over time, the index won't be able
to reuse pages that are no longer in use. This is fixed in 7.4.
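
For example (table and index names illustrative):

REINDEX TABLE mytable;               -- rebuild all indexes on the table
REINDEX INDEX mytable_created_idx;   -- or rebuild just one index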

Re: Growing Database Size

From: Kenneth Godee
Date: Sat, 14 Jun 2003 13:50
>
> You probably want to run a REINDEX. If you have an index on a column
> whose values only increase or decrease over time, the index won't be able
> to reuse pages that are no longer in use. This is fixed in 7.4.
>

I'm kinda having the same type of problem. I'm updating an entire
database nightly with...

-- (table and file names illustrative)
BEGIN;
DELETE FROM table_a;  -- deleting all rows from a table
DELETE FROM table_b;  -- deleting all rows from a table
COPY table_a FROM '/path/to/table_a.dat';  -- populating with new rows
COPY table_b FROM '/path/to/table_b.dat';  -- populating with new rows
COMMIT;
VACUUM ANALYZE;

I don't fully understand the "FSM" settings and can find no docs or any
info on how to tweak them, besides an email now and then saying to up
them. Not only that, but (as of a few days ago) there seems to be no way
to search prior posts on this topic. The archives seem to be offline; last
time I tried, it seems my IP address wasn't in the "pg_hba.conf" file, and
Google site: searches don't work either, just "The page cannot be found"
errors.
So my database just grows and grows, and on my monthly list of todos I take
the database offline into single-user mode and reindex and vacuum full.

Am I missing something? Does anyone know where to find any docs on how to
adjust the FSM settings?
I usually try to do my homework before asking these questions, but...

Any help in understanding this process would be great; I think it would help others too.

Have you checked the size of your database lately?

Or is all this stuff updated in the newer versions? I'm running 7.2.1.


Re: Growing Database Size

From: "Nigel J. Andrews"
On Sat, 14 Jun 2003, Kenneth Godee wrote:

> >
> > You probably want to run a REINDEX. If you have an index on a column
> > whose values only increase or decrease over time, the index won't be able
> > to reuse pages that are no longer in use. This is fixed in 7.4.
> >
>
> I'm kinda having the same type of problem. I'm updating an entire
> database nightly with...
>
> -- (table and file names illustrative)
> BEGIN;
> DELETE FROM table_a;  -- deleting all rows from a table
> DELETE FROM table_b;  -- deleting all rows from a table
> COPY table_a FROM '/path/to/table_a.dat';  -- populating with new rows
> COPY table_b FROM '/path/to/table_b.dat';  -- populating with new rows
> COMMIT;
> VACUUM ANALYZE;
>
> I don't fully understand the "FSM" settings and can find no docs or any
> info on how to tweak them, besides an email now and then saying to up
> them. Not only that, but (as of a few days ago) there seems to be no way
> to search prior posts on this topic. The archives seem to be offline; last
> time I tried, it seems my IP address wasn't in the "pg_hba.conf" file, and
> Google site: searches don't work either, just "The page cannot be found"
> errors.
> So my database just grows and grows, and on my monthly list of todos I take
> the database offline into single-user mode and reindex and vacuum full.
>
> Am I missing something? Does anyone know where to find any docs on how to
> adjust the FSM settings?
> I usually try to do my homework before asking these questions, but...

Well, is that a vacuum or a vacuum analyse you're doing after that nightly
delete/copy sequence? You need to run vacuum to stand any chance of your FSM
settings actually helping you, since the FSM is for tracking freed space in the
file store and vacuum is what does that scan. At least I don't think vacuum
analyse does, but I may well be mistaken on that (pretty basic, and I really
should look at those details some time).

Without that, the COPYs are just going to stick every tuple after the space
left by all the deleted tuples. Also, why not use a delete, vacuum full, copy
sequence? I see you're doing this in a transaction, so you probably don't want
the alternative of truncate, copy, which I think, but don't know, will truncate
the file store.
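
That sequence would look something like this per table (names and paths
illustrative; note vacuum can't run inside a transaction block, so you give
up the all-or-nothing behaviour of the original):

DELETE FROM table_a;
VACUUM FULL table_a;                        -- compact the file while the table is empty
COPY table_a FROM '/path/to/table_a.dat';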

Also, is there some reason you're not wanting to run vacuum full more
frequently?


>
> Any help in understanding this process would be great; I think it would help others too.

FSM stands for Free Space Map. The space taken by deleted tuples is
automatically reused only if that space is recorded in the FSM. Vacuum fills in
the FSM, and vacuum full moves undeleted tuples into the space taken by deleted
tuples (without using the FSM).
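
In other words (table name illustrative):

VACUUM table_a;        -- scan the table and record reusable space in the FSM
VACUUM FULL table_a;   -- move tuples to compact the table, shrinking the file
SELECT relpages FROM pg_class WHERE relname = 'table_a';  -- rough size check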

> Have you checked the size of your database lately?
>
> Or is all this stuff updated in the newer versions? I'm running 7.2.1.

Upgrade to 7.2.4; it should be painless, and it's highly recommended.

Well, if you're not reusing the same range of indexed values for each new load
of data, I would expect you'll find the disk usage increasing for those
indexes. There isn't a way to avoid the periodic reindex if you want to
maintain reasonable disk usage in that case. I'm not sure if that's different
in 7.3 or 7.4.


--
Nigel J. Andrews



Re: Growing Database Size

From: "Shridhar Daithankar"
Date: Sun, 15 Jun 2003 16:05:59 +0530
On 14 Jun 2003 at 13:50, Kenneth Godee wrote:

> >
> > You probably want to run a REINDEX. If you have an index on a column
> > whose values only increase or decrease over time, the index won't be able
> > to reuse pages that are no longer in use. This is fixed in 7.4.
> >
>
> I'm kinda having the same type of problem. I'm updating an entire
> database nightly with...
>
> -- (table and file names illustrative)
> BEGIN;
> DELETE FROM table_a;  -- deleting all rows from a table
> DELETE FROM table_b;  -- deleting all rows from a table
> COPY table_a FROM '/path/to/table_a.dat';  -- populating with new rows
> COPY table_b FROM '/path/to/table_b.dat';  -- populating with new rows
> COMMIT;
> VACUUM ANALYZE;

Wow... that would really kill things.

How about

begin;
drop table a;
drop table b;
create table a (...);            -- recreate with the original definitions
create table b (...);
copy a from '/path/to/a.dat';    -- server-side file paths, illustrative
copy b from '/path/to/b.dat';
commit;
vacuum analyze;

I can almost bet that this would be faster than the solution above, as it would
not churn the disk as much, put simply.

Hmm... just wondering, would truncate in the above steps do the same job?

> Am I missing something? Does anyone know where to find any docs on how to
> adjust the FSM settings?
> I usually try to do my homework before asking these questions, but...

This is off the top of my head, so take it with a grain of salt. Correct me if
I am wrong.

max_fsm_relations = the number of relations/tables in the database that get
updated or deleted from.
max_fsm_pages = the number of data pages (8KB blocks) whose freed space can be
tracked. Each tracked page needs a small entry of a few bytes in the FSM, so if
you delete/update rows on 100K pages, you need room for some 100K entries.

Unless you have a big enough FSM, vacuum won't be as effective, as it won't be
able to track all the wasted space.
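
Concretely, the settings live in postgresql.conf and only take effect on a
postmaster restart (values below are illustrative, not recommendations):

max_fsm_relations = 1000    # >= number of tables that get updated/deleted from
max_fsm_pages = 100000      # >= pages with freed space between vacuums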

 HTH

Bye
 Shridhar

--
Peers's Law:    The solution to a problem changes the nature of the problem.


Re: Growing Database Size

From: "Jim C. Nasby"
On Sun, Jun 15, 2003 at 04:05:59PM +0530, Shridhar Daithankar wrote:
> On 14 Jun 2003 at 13:50, Kenneth Godee wrote:
> How about
>
> begin;
> drop table a;
> drop table b;
> create table a (...);            -- recreate with the original definitions
> create table b (...);
> copy a from '/path/to/a.dat';    -- server-side file paths, illustrative
> copy b from '/path/to/b.dat';
> commit;
> vacuum analyze;
>
> I can almost bet that this would be faster than the solution above, as it would
> not churn the disk as much, put simply.
>
> Hmm... just wondering, would truncate in the above steps do the same job?

Yes, truncate table would do the same job and, more importantly, wouldn't
go mucking about in the catalog tables. Assuming it's available in 7.2,
you should absolutely use truncate instead of drop/create (man, I wish
more databases supported truncate).
--
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Re: Growing Database Size

From: Tom Lane
"Jim C. Nasby" <jim@nasby.net> writes:
> On Sun, Jun 15, 2003 at 04:05:59PM +0530, Shridhar Daithankar wrote:
>> On 14 Jun 2003 at 13:50, Kenneth Godee wrote:
>> begin;
>> drop table a;
>> drop table b;
>> create table a (...);            -- recreate with the original definitions
>> create table b (...);
>> copy a from '/path/to/a.dat';    -- server-side file paths, illustrative
>> copy b from '/path/to/b.dat';
>> commit;
>> vacuum analyze;

>> Hmm... just wondering, would truncate in the above steps do the same job?

> Yes, truncate table would do the same job and, more importantly, wouldn't
> go mucking about in the catalog tables. Assuming it's available in 7.2,
> you should absolutely use truncate instead of drop/create (man, I wish
> more databases supported truncate).

However, the originally given sequence is transaction-safe (ie, on
failure it will roll back to the original state), whereas
something involving TRUNCATE will not be.  Until 7.4 anyway.
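
On 7.4, though, the truncating version can go inside the transaction too,
something like (paths illustrative):

BEGIN;
TRUNCATE TABLE a;
TRUNCATE TABLE b;
COPY a FROM '/path/to/a.dat';
COPY b FROM '/path/to/b.dat';
COMMIT;          -- if anything fails, the old contents roll back into place
VACUUM ANALYZE;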

Shridhar's drop/recreate is transaction-safe and looks like a good
solution if you're not worried about foreign key linkages and such.

            regards, tom lane