Thread: abnormal data growth
Hi,

I have been having this problem where the database size suddenly grows from the normal size of about 300 MB to 12 GB in one night. When I look up the table sizes, the biggest one is only 41 MB and the total of all table sizes is only 223 MB. But in the filesystem data directory the total size is 12 GB. I noticed there are 10 files of about 1 GB each:

1.1G 25677563
1.1G 25677563.1
1.1G 25677563.2
1.1G 25677563.3
1.1G 25677563.4
1.1G 25677563.5
1.1G 25677563.6
1.1G 25677563.7
1.1G 25677563.8
1.1G 25677563.9

So I tried to find out what table that is, but this query returns nothing:

SELECT relname, relpages FROM pg_class
WHERE relname = 'pg_toast_25677563' OR relname = 'pg_toast_25677563_index'
ORDER BY relname;

And if I run this query:

SELECT relname, relpages FROM pg_class ORDER BY relpages DESC;

the top entries are these tables (which do not match the id of the 1.1 GB files):

pg_toast_25677561          1382845
pg_toast_25677561_index      22116

I'm just wondering: is there a way to know what that 25677563 file is? Why does postgres create a copy of that file with .1, .2, .3, etc.? Is that file still in use (because I can't find it in the pg_class table)? Will postgres clean the database up, or should I do a manual clean?

- reynard
Reynard Hilman <reynardmh@lightsky.com> writes:
> But in the filesystem data directory the total size is 12 GB. I noticed
> there are 10 files of about 1 GB each:
> 1.1G 25677563
> 1.1G 25677563.1
> ...
> I'm just wondering: is there a way to know what that 25677563 file is?

select relname from pg_class where relfilenode = 25677563;

			regards, tom lane
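For reference, a slightly expanded form of that lookup (the file number here is just the one reported above; substitute your own) that also shows what kind of object owns the file and how many 8 kB pages it spans:

    -- Map an on-disk file name back to the relation that owns it.
    -- relkind: 'r' = table, 'i' = index, 't' = TOAST table
    SELECT relname, relkind, relpages
    FROM pg_class
    WHERE relfilenode = 25677563;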
Reynard Hilman <reynardmh@lightsky.com> writes:
> Hi,
>
> I have been having this problem where the database size suddenly grows
> from the normal size of about 300 MB to 12 GB in one night.
> When I look up the table sizes, the biggest one is only 41 MB and the
> total of all table sizes is only 223 MB.
> But in the filesystem data directory the total size is 12 GB. I noticed
> there are 10 files of about 1 GB each:
> 1.1G 25677563
> 1.1G 25677563.1
> ...
> 1.1G 25677563.9

25677563 is the relfilenode of this object (usually the same as its OID). The different files are segments of the relation, not duplicates (PG restricts individual data files to 1 GB). So look for which relation has that relfilenode -- it's quite possibly an index that is bloating up.

-Doug

--
Let us cross over the river, and rest under the shade of the trees.
   --T. J. Jackson, 1863
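Building on that, a quick sketch for spotting the space hogs from the catalog side, assuming the default 8 kB page size (relpages is only refreshed by VACUUM/ANALYZE, so run one first if the numbers look stale):

    -- Ten largest relations (tables, indexes, and TOAST tables) by page count
    SELECT relname, relkind, relpages,
           relpages * 8 / 1024 AS approx_mb   -- assumes 8 kB pages
    FROM pg_class
    ORDER BY relpages DESC
    LIMIT 10;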
On Tue, Sep 21, 2004 at 09:51:15AM -0500, Reynard Hilman wrote:

> I'm just wondering: is there a way to know what that 25677563 file is?
> Why does postgres create a copy of that file with .1, .2, .3, etc.?

Those are not copies. Postgres splits each relation (table/index) into 1 GB files. So the relation actually uses 10 GB; you need some cleanup.

> Is that file still in use (because I can't find it in the pg_class
> table)?

Yes. Don't delete it manually.

> Will postgres clean the database up or should I do a manual clean?

You probably need a VACUUM; or, if it's an index, a REINDEX.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Granting software the freedom to evolve guarantees only different
results, not better ones." (Zygo Blaxell)
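For example, with placeholder names (substitute whatever the relfilenode lookup identifies):

    VACUUM VERBOSE my_big_table;       -- marks dead row space reusable and reports per-table stats
    VACUUM FULL my_big_table;          -- compacts the table and returns space to the OS; takes an exclusive lock
    REINDEX INDEX my_bloated_index;    -- rebuilds a bloated index from scratch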
>> Is that file still in use (because I can't find it in the pg_class
>> table)?
>
> Yes. Don't delete it manually.

Thanks for that advice :) I wasn't really going to delete it, just tempted to.

Following Tom's advice, this query:

select relname from pg_class where relfilenode = 25677563;

returns pg_toast_25677561, which is the record in pg_class that has the biggest relpages, so that makes sense. I'll try running REINDEX tonight, then; hopefully that does some cleanup.

Thanks for all the advice,
- reynard
Trying to build a GiST index on a column in a table.

The table contains 100k rows. The column is an integer[]. Each row contains about 20-30 distinct values chosen between 1 and 437.

Aim: search the arrays with the GiST integer array operators (@ etc.).

Creating the index with gist__int_ops takes forever and, after something like one hour, fills the disk to the brim. There is about 4 GB of free space on the partition and it eats it completely.

Doing the same with only 10k rows takes forever too. Bumping up sort_mem to 128 MB does nothing. The only way to make it work is to create the index on an empty table and insert the rows afterwards. I only tried 10k rows, as the disk space consumption is alarming.

Search speed in these 10k rows using the index is a lot slower (10x) than a sequential scan.

A problem with this contrib module?
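For reference, the setup described above would look roughly like this; the table and column names are invented, since the post doesn't give them:

    -- requires contrib/intarray to be installed
    CREATE TABLE items (id serial PRIMARY KEY, tags integer[]);
    -- the problematic index: contrib/intarray's gist__int_ops operator class
    CREATE INDEX items_tags_idx ON items USING gist (tags gist__int_ops);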
Reynard Hilman <reynardmh@lightsky.com> writes:
> Following Tom's advice, this query:
> select relname from pg_class where relfilenode = 25677563;
> returns pg_toast_25677561, which is the record in pg_class that has the
> biggest relpages, so that makes sense.

Okay, so you have a lot of wide (toasted) fields in whatever table that toast table belongs to --- if you're not sure, try

select relname from pg_class where
  reltoastrelid = (select oid from pg_class where relfilenode = 25677563);

VACUUM VERBOSE on that table would give some useful info.

			regards, tom lane
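A variant of the same idea that lists every table next to the size of its TOAST table, so a bloated toast relation points straight at its owner (again assuming 8 kB pages and reasonably fresh relpages statistics):

    SELECT c.relname               AS table_name,
           t.relname               AS toast_table,
           t.relpages * 8 / 1024   AS approx_toast_mb
    FROM pg_class c
    JOIN pg_class t ON t.oid = c.reltoastrelid
    WHERE c.reltoastrelid <> 0
    ORDER BY t.relpages DESC;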
> Okay, so you have a lot of wide (toasted) fields in whatever table that
> toast table belongs to --- if you're not sure, try
>
> select relname from pg_class where
>   reltoastrelid = (select oid from pg_class where relfilenode = 25677563);
>
> VACUUM VERBOSE on that table would give some useful info.
>
> 			regards, tom lane

Thank you Tom! This query finds the culprit right away. There is one table that acts as a log file, storing the transaction SQL that's happening on the db, so the db size grows when the logging is turned on. After doing some cleaning on this table, the database shrinks to 500 MB again.

Just curious though: why does the toast file for this transaction table take the most space (the transaction table itself was only about 10 MB), considering there are a few other tables that are bigger (40 MB) than this transaction table but don't have toast files comparable in size to this one?

- reynard
Have you tried gist__intbig_ops?

	Oleg

On Tue, 21 Sep 2004, Pierre-Frédéric Caillaud wrote:

> Trying to build a GiST index on a column in a table.
> The table contains 100k rows. The column is an integer[]. Each row
> contains about 20-30 distinct values chosen between 1 and 437.
> Aim: search the arrays with the GiST integer array operators (@ etc.).
>
> Creating the index with gist__int_ops takes forever and, after something
> like one hour, fills the disk to the brim. There is about 4 GB of free
> space on the partition and it eats it completely.
>
> Doing the same with only 10k rows takes forever too. Bumping up sort_mem
> to 128 MB does nothing. The only way to make it work is to create the
> index on an empty table and insert the rows afterwards. I only tried
> 10k rows, as the disk space consumption is alarming.
>
> Search speed in these 10k rows using the index is a lot slower (10x)
> than a sequential scan.
>
> A problem with this contrib module?

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Thanks for the hint, it did work. The table weighs 270 MB and the index weighs 580 MB, but searches are fast.

> Have you tried gist__intbig_ops?
>
> 	Oleg
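For anyone hitting the same wall later, the fix was simply a different operator class from contrib/intarray; a sketch using the same invented names as above:

    -- gist__intbig_ops indexes a fixed-size signature of each array,
    -- which keeps the index compact even with many distinct values
    CREATE INDEX items_tags_idx ON items USING gist (tags gist__intbig_ops);

    -- example search with the intarray "contains" operator mentioned in the original post
    SELECT id FROM items WHERE tags @ '{5, 17}'::integer[];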
On Tuesday September 21 2004 12:40, Reynard Hilman wrote:

> > Okay, so you have a lot of wide (toasted) fields in whatever table that
> > toast table belongs to ...
>
> Just curious though: why does the toast file for this transaction table
> take the most space (the transaction table itself was only about 10 MB),
> considering there are a few other tables that are bigger (40 MB) than
> this transaction table but don't have toast files comparable in size
> to this one?

Not sure this is your problem, but when I have tables whose on-disk size is significantly larger than the size reported by contrib's dbsize.sql queries (if that's what you're using), I begin to suspect disk-space "leakage" due to insufficient postgresql.conf settings for max_fsm_pages and max_fsm_relations. Those settings need to be bigger than the maximum amount of disk space (in pages) and the number of relations that you'll ever use, or you may experience this leakage, requiring a "vacuum full" or dump/reload (alias downtime) to repair.

Ed
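Illustrative postgresql.conf entries only; the right values depend entirely on how much churn the database sees, and (if memory serves for releases of this era) a database-wide VACUUM VERBOSE reports how much of the free space map is actually in use, which is the best sizing guide:

    # postgresql.conf -- illustrative values, not recommendations
    max_fsm_pages = 200000       # must cover every page that holds reusable free space
    max_fsm_relations = 1000     # must cover every table and index being vacuumed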