Thread: Full vacuuming of BIG tables takes too long
Hi, all.

Just an example. I have a table in an ISP billing database which grows over a two-month period to as many as 35,000,000 records, taking 13 GB of disk space. At that size a nightly 'vacuum analyze' is not enough: after it the table continues to grow (though not very fast).

A 'vacuum full' takes too long. I can only wait 5-6 hours (and that is not enough), because it locks the table and the number of processes waiting for their inserts becomes too high. It is actually much faster (40-50 minutes) to dump the entire database, then drop it, recreate it and restore it.

I know that 'vacuum_mem = 65536' is not enough to make 'vacuum full' fast enough, so I want to ask: if I decide to increase that number, would 512 MB, for example, be better?

Are there any other init parameters that can help me? Or, with this amount of data, is dump/recreate/restore the best way?

WBR, Eugene.
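[Editor's note: for reference, the maintenance described above amounts to roughly the following. The table name traffic_log is only a placeholder; vacuum_mem in this era of PostgreSQL is given in kilobytes, so 65536 means 64 MB.]

    -- nightly, while the table stays writable:
    VACUUM ANALYZE traffic_log;    -- marks dead rows reusable and refreshes planner stats
    -- occasionally, under an exclusive lock that blocks inserts:
    VACUUM FULL traffic_log;       -- physically compacts the table

    # postgresql.conf
    vacuum_mem = 65536             # current value, in kB (64 MB)
    #vacuum_mem = 524288           # the 512 MB the poster is considering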
Hello Thierry,

Thursday, May 22, 2003, 5:45:26 PM, you wrote:

TM> Hi,
TM> I don't have a solution but :
TM> 1) Is your system spending time in Wait I/O while vacuum analyze is running?

Almost none. During the first 30 minutes the summary I/O (iostat 1) is 20-25 MB/s, then only 3-5 MB/s. The "i/o wait" counters in the CPU activity are not too high.

TM> Perhaps you can save time by increasing I/O throughput.

I.e. use SCSI HDDs? 8)) Maybe. But I still hope that the problem can be solved by increasing memory / tuning initialization parameters...

TM> 2) In the alternative dump/recreate/restore, do you recreate the Foreign Keys?
TM> This step takes a long time (depending on your database schema). I have tried
TM> this scenario:
TM> Dump data / Drop Foreign Key / Truncate Tables / restore / Recreate the Foreign Key
TM> The Recreate FK step takes twice as long as the first four steps.

I don't use foreign keys, only primary keys. There are only 3 tables in that db; 99% of the space is taken by one table.

--
Best regards,
Eugene    mailto:emz@norma.perm.ru
On Thu, 2003-05-22 at 08:37, Eugene M. Zheganin wrote:

My first suggestion would be to check your free space map settings in postgresql.conf and make sure they are set high enough. Setting them too low can cause Vacuum Full to take longer than necessary.

> Thursday, May 22, 2003, 5:45:26 PM, you wrote:
> TM> 2) In the alternative dump/recreate/restore, do you recreate the Foreign Keys?
> TM> This step takes a long time (depending on your database schema). I have tried
> TM> this scenario:
> TM> Dump data / Drop Foreign Key / Truncate Tables / restore / Recreate the Foreign Key
> TM> The Recreate FK step takes twice as long as the first four steps.
>
> I don't use foreign keys, only primary keys. There are only 3 tables in
> that db; 99% of the space is taken by one table.

Is it possible for you to do:

Begin; Create Table t2 As Select * From t1; Drop Table t1; Alter Table t2 Rename To t1; Commit;

Note you might want to lock t1 from writers, but people could still select from it while you're making the switch.

Robert Treat
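[Editor's note: a rough sketch of the swap Robert describes, with the writer lock he mentions added. This is only an illustration: t1 is the placeholder name from his message, and CREATE TABLE AS does not copy indexes or constraints, so the primary key at the end is a hypothetical example that would have to match the real table.]

    BEGIN;
    LOCK TABLE t1 IN EXCLUSIVE MODE;       -- blocks writers; readers can still SELECT
    CREATE TABLE t2 AS SELECT * FROM t1;   -- writes a compact copy of the live rows
    DROP TABLE t1;
    ALTER TABLE t2 RENAME TO t1;
    -- recreate indexes/constraints by hand, e.g.:
    -- ALTER TABLE t1 ADD PRIMARY KEY (id);
    COMMIT;

The copy needs extra disk space for the live rows, but unlike old-style VACUUM FULL it writes the new table sequentially, which tends to be much faster on a heavily bloated table.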
Hi,

I don't have a solution, but:

1) Is your system spending time in Wait I/O while vacuum analyze is running? Perhaps you can save time by increasing I/O throughput.

2) In the alternative dump/recreate/restore, do you recreate the Foreign Keys? This step takes a long time (depending on your database schema). I have tried this scenario:

Dump data / Drop Foreign Key / Truncate Tables / restore / Recreate the Foreign Key

The Recreate FK step takes twice as long as the first four steps.

Thierry

"Eugene M. Zheganin" wrote:
> Just an example. I have a table in an ISP billing database which grows over a
> two-month period to as many as 35,000,000 records, taking 13 GB of disk space.
> At that size a nightly 'vacuum analyze' is not enough [...]
> Or, with this amount of data, is dump/recreate/restore the best way?
>
> WBR, Eugene.
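[Editor's note: for concreteness, Thierry's five steps could look roughly like this for a single referencing table. Every name here (child, parent, fk_child_parent, parent_id, billing) is invented for the illustration, and the dump/restore steps are shown only as comments since they run from the shell.]

    -- 1. dump the data only, e.g.:  pg_dump -a -t child billing > child.sql
    ALTER TABLE child DROP CONSTRAINT fk_child_parent;      -- 2. drop the FK
    TRUNCATE child;                                          -- 3. truncate
    -- 4. restore the dumped rows, e.g.:  psql billing < child.sql
    ALTER TABLE child ADD CONSTRAINT fk_child_parent         -- 5. recreate the FK: the slow step,
      FOREIGN KEY (parent_id) REFERENCES parent (id);        --    every restored row is re-checked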
"Eugene M. Zheganin" <emz@norma.perm.ru> writes: > Just example. I have an table in ISP billing base, which every 2 > months grows and contains up to 35,000,000 of records. That takes > 13Gb of disk space. On that amount 'vacuum analyze' (every night) is > inefficient, cause after it the table continues to grow (but not > very fast). If you aren't getting results from plain vacuum then you need to increase your FSM settings. regards, tom lane
You are likely to run into unhappy situations if you use views, rules, etc. Depending on the version of PG, it will either refuse to drop the table if there are dependencies, or it will drop it but the views won't be "connected" to the new version of t1 after you run your operation.

Cheers,
Steve

On Thursday 22 May 2003 07:01, Robert Treat wrote:
> Is it possible for you to do: Begin; Create Table t2 As Select * From t1;
> Drop Table t1; Alter Table t2 Rename To t1; Commit;
>
> Note you might want to lock t1 from writers, but people could still
> select from it while you're making the switch.
>
> Robert Treat
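[Editor's note: a minimal illustration of the dependency problem Steve warns about; the view name v_traffic is invented for the example.]

    CREATE VIEW v_traffic AS SELECT * FROM t1;
    -- On releases with dependency tracking, the swap's DROP fails unless
    -- CASCADE is used -- and CASCADE drops the view along with the table:
    DROP TABLE t1;
    -- On older releases the drop can succeed, but the view still refers to the
    -- original table internally and does not follow the renamed t2.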
Hello Robert,

Thursday, May 22, 2003, 8:01:58 PM, you wrote:

RT> On Thu, 2003-05-22 at 08:37, Eugene M. Zheganin wrote:
RT> My first suggestion would be to check your free space map settings in
RT> postgresql.conf and make sure they are set high enough.

They are set to the defaults.

RT> Setting them too low can cause Vacuum Full to take longer than necessary.

I have read the "Server Run-time Environment" chapter of the PostgreSQL docs, but I don't understand how to calculate the optimum values. (Where can I read about that in the pgsql docs?) Can you give me any further explanation? 8)

--
Best regards,
Eugene    mailto:emz@norma.perm.ru
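[Editor's note: one rough way to get a starting point, not given in the thread but sketched here as an illustration (traffic_log is again a placeholder): compare how many 8 KB pages the big table occupies with what max_fsm_pages can remember.]

    -- pages currently used by the table (as of its last VACUUM/ANALYZE):
    SELECT relname, relpages FROM pg_class WHERE relname = 'traffic_log';
    -- 13 GB / 8 KB is roughly 1,700,000 pages; a default max_fsm_pages in the
    -- low tens of thousands cannot track even a small fraction of them.
    -- VACUUM VERBOSE traffic_log;  -- its per-table output shows how many pages held removable rows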