Thread: abnormal data growth
Hi,

I have been having this problem where the database size suddenly grows from the normal size of about 300 MB to 12 GB in one night. When I look up the table sizes, the biggest one is only 41 MB and the total of all table sizes is only 223 MB. But in the filesystem data directory the total size is 12 GB. I noticed there are 10 files of about 1 GB each:

1.1G 25677563
1.1G 25677563.1
1.1G 25677563.2
1.1G 25677563.3
1.1G 25677563.4
1.1G 25677563.5
1.1G 25677563.6
1.1G 25677563.7
1.1G 25677563.8
1.1G 25677563.9

So I tried to find out what table that is, but this query returns nothing:

SELECT relname, relpages FROM pg_class
WHERE relname = 'pg_toast_25677563' OR relname = 'pg_toast_25677563_index'
ORDER BY relname;

And if I run this query:

SELECT relname, relpages FROM pg_class ORDER BY relpages DESC;

the top entries are these tables (which do not match the id of the 1.1 GB files):

pg_toast_25677561          1382845
pg_toast_25677561_index      22116

I'm just wondering: is there a way to know what that 25677563 file is? Why does postgres create a copy of that file with .1, .2, .3, etc.? Is that file still in use (because I can't find it in the pg_class table)? Will postgres clean the database up, or should I do a manual clean?

- reynard
Reynard Hilman <reynardmh@lightsky.com> writes:
> But in the filesystem data directory the total size is 12 GB. I noticed
> there are 10 files of about 1 GB each:
> 1.1G 25677563
> 1.1G 25677563.1
> ...
> I'm just wondering: is there a way to know what that 25677563 file is?

select relname from pg_class where relfilenode = 25677563;

			regards, tom lane
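For reference, a slightly expanded form of that lookup (the file number here is just the one reported above; substitute your own) that also shows what kind of object owns the file and how many 8 kB pages it spans:

    -- Map an on-disk file name back to the relation that owns it.
    -- relkind: 'r' = table, 'i' = index, 't' = TOAST table
    SELECT relname, relkind, relpages
    FROM pg_class
    WHERE relfilenode = 25677563;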
Reynard Hilman <reynardmh@lightsky.com> writes:
> Hi,
>
> I have been having this problem where the database size suddenly grows
> from the normal size of about 300 MB to 12 GB in one night.
> When I look up the table sizes, the biggest one is only 41 MB and the
> total of all table sizes is only 223 MB.
> But in the filesystem data directory the total size is 12 GB. I noticed
> there are 10 files of about 1 GB each:
> 1.1G 25677563
> 1.1G 25677563.1
> ...
> 1.1G 25677563.9

25677563 is the relfilenode of this object (usually the same as its OID). The different files are segments of the relation, not duplicates (PG restricts individual data files to 1 GB). So look for which relation has that relfilenode -- it's quite possibly an index that is bloating up.

-Doug

--
Let us cross over the river, and rest under the shade of the trees.
   --T. J. Jackson, 1863
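Building on that, a quick sketch for spotting the space hogs from the catalog side, assuming the default 8 kB page size (relpages is only refreshed by VACUUM/ANALYZE, so run one first if the numbers look stale):

    -- Ten largest relations (tables, indexes, and TOAST tables) by page count
    SELECT relname, relkind, relpages,
           relpages * 8 / 1024 AS approx_mb   -- assumes 8 kB pages
    FROM pg_class
    ORDER BY relpages DESC
    LIMIT 10;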
On Tue, Sep 21, 2004 at 09:51:15AM -0500, Reynard Hilman wrote:

> I'm just wondering: is there a way to know what that 25677563 file is?
> Why does postgres create a copy of that file with .1, .2, .3, etc.?

Those are not copies. Postgres splits each relation (table/index) into 1 GB files. So the relation actually uses 10 GB; you need some cleanup.

> Is that file still in use (because I can't find it in the pg_class
> table)?

Yes. Don't delete it manually.

> Will postgres clean the database up or should I do a manual clean?

You probably need a VACUUM; or, if it's an index, a REINDEX.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Granting software the freedom to evolve guarantees only different
results, not better ones." (Zygo Blaxell)
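For example, with placeholder names (substitute whatever the relfilenode lookup identifies):

    VACUUM VERBOSE my_big_table;       -- marks dead row space reusable and reports per-table stats
    VACUUM FULL my_big_table;          -- compacts the table and returns space to the OS; takes an exclusive lock
    REINDEX INDEX my_bloated_index;    -- rebuilds a bloated index from scratch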
>> Is that file still in use (because I can't find it in the pg_class
>> table)?
>
> Yes. Don't delete it manually.

Thanks for that advice :) I wasn't really going to delete it, just tempted to.

Following Tom's advice, this query:

select relname from pg_class where relfilenode = 25677563;

returns pg_toast_25677561, which is the record in pg_class that has the biggest relpages, so that makes sense. I'll try running REINDEX tonight, then; hopefully that does some cleanup.

Thanks for all the advice,
- reynard
Trying to build a GiST index on a column in a table.

The table contains 100k rows. The column is an integer[]. Each row contains about 20-30 distinct values chosen between 1 and 437.

Aim: search the arrays with the GiST integer array operators (@ etc.).

Creating the index with gist__int_ops takes forever and, after something like one hour, fills the disk to the brim. There is about 4 GB of free space on the partition and it eats it completely.

Doing the same with only 10k rows takes forever too. Bumping up sort_mem to 128 MB does nothing. The only way to make it work is to create the index on an empty table and insert the rows afterwards. I only tried 10k rows, as the disk space consumption is alarming.

Search speed in these 10k rows using the index is a lot slower (10x) than a sequential scan.

A problem with this contrib module?
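For reference, the setup described above would look roughly like this; the table and column names are invented, since the post doesn't give them:

    -- requires contrib/intarray to be installed
    CREATE TABLE items (id serial PRIMARY KEY, tags integer[]);
    -- the problematic index: contrib/intarray's gist__int_ops operator class
    CREATE INDEX items_tags_idx ON items USING gist (tags gist__int_ops);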
Reynard Hilman <reynardmh@lightsky.com> writes:
> Following Tom's advice, this query:
> select relname from pg_class where relfilenode = 25677563;
> returns pg_toast_25677561, which is the record in pg_class that has the
> biggest relpages, so that makes sense.

Okay, so you have a lot of wide (toasted) fields in whatever table that toast table belongs to --- if you're not sure, try

select relname from pg_class where
  reltoastrelid = (select oid from pg_class where relfilenode = 25677563);

VACUUM VERBOSE on that table would give some useful info.

			regards, tom lane
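A variant of the same idea that lists every table next to the size of its TOAST table, so a bloated toast relation points straight at its owner (again assuming 8 kB pages and reasonably fresh relpages statistics):

    SELECT c.relname               AS table_name,
           t.relname               AS toast_table,
           t.relpages * 8 / 1024   AS approx_toast_mb
    FROM pg_class c
    JOIN pg_class t ON t.oid = c.reltoastrelid
    WHERE c.reltoastrelid <> 0
    ORDER BY t.relpages DESC;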
> Okay, so you have a lot of wide (toasted) fields in whatever table that
> toast table belongs to --- if you're not sure, try
>
> select relname from pg_class where
>   reltoastrelid = (select oid from pg_class where relfilenode = 25677563);
>
> VACUUM VERBOSE on that table would give some useful info.
>
> 			regards, tom lane

Thank you Tom! This query finds the culprit right away. There is one table that acts as a log file, storing the transaction SQL that's happening on the db, so the db size grows when the logging is turned on. After doing some cleaning on this table, the database shrinks to 500 MB again.

Just curious though: why does the toast file for this transaction table take the most space (the transaction table itself was only about 10 MB), considering there are a few other tables that are bigger (40 MB) than this transaction table but don't have toast files comparable in size to this one?

- reynard
Have you tried gist__intbig_ops?

	Oleg

On Tue, 21 Sep 2004, Pierre-Frédéric Caillaud wrote:

> Trying to build a GiST index on a column in a table.
> The table contains 100k rows. The column is an integer[]. Each row
> contains about 20-30 distinct values chosen between 1 and 437.
> Aim: search the arrays with the GiST integer array operators (@ etc.).
>
> Creating the index with gist__int_ops takes forever and, after something
> like one hour, fills the disk to the brim. There is about 4 GB of free
> space on the partition and it eats it completely.
>
> Doing the same with only 10k rows takes forever too. Bumping up sort_mem
> to 128 MB does nothing. The only way to make it work is to create the
> index on an empty table and insert the rows afterwards. I only tried
> 10k rows, as the disk space consumption is alarming.
>
> Search speed in these 10k rows using the index is a lot slower (10x)
> than a sequential scan.
>
> A problem with this contrib module?

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Thanks for the hint, it did work. The table weighs 270 MB and the index weighs 580 MB, but searches are fast.

> Have you tried gist__intbig_ops?
>
> 	Oleg
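For anyone hitting the same wall later, the fix was simply a different operator class from contrib/intarray; a sketch using the same invented names as above:

    -- gist__intbig_ops indexes a fixed-size signature of each array,
    -- which keeps the index compact even with many distinct values
    CREATE INDEX items_tags_idx ON items USING gist (tags gist__intbig_ops);

    -- example search with the intarray "contains" operator mentioned in the original post
    SELECT id FROM items WHERE tags @ '{5, 17}'::integer[];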
On Tuesday September 21 2004 12:40, Reynard Hilman wrote:

> > Okay, so you have a lot of wide (toasted) fields in whatever table that
> > toast table belongs to ...
>
> Just curious though: why does the toast file for this transaction table
> take the most space (the transaction table itself was only about 10 MB),
> considering there are a few other tables that are bigger (40 MB) than
> this transaction table but don't have toast files comparable in size
> to this one?

Not sure this is your problem, but when I have tables whose on-disk size is significantly larger than the size reported by contrib's dbsize.sql queries (if that's what you're using), I begin to suspect disk-space "leakage" due to insufficient postgresql.conf settings for max_fsm_pages and max_fsm_relations. Those settings need to be bigger than the maximum amount of disk space (in pages) and the number of relations that you'll ever use, or you may experience this leakage, requiring a "vacuum full" or dump/reload (alias downtime) to repair.

Ed
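Illustrative postgresql.conf entries only; the right values depend entirely on how much churn the database sees, and (if memory serves for releases of this era) a database-wide VACUUM VERBOSE reports how much of the free space map is actually in use, which is the best sizing guide:

    # postgresql.conf -- illustrative values, not recommendations
    max_fsm_pages = 200000       # must cover every page that holds reusable free space
    max_fsm_relations = 1000     # must cover every table and index being vacuumed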