Thread: best practice to avoid table bloat?
Hi,
If I have a table from which about 8 million rows are deleted nightly (the table holds maybe 9 million), is it recommended to run a VACUUM ANALYZE after the delete completes, or can I leave this job to autovacuum?
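The nightly job is essentially the following (table and column names are illustrative, not my real schema):

    -- removes roughly 8 of the table's 9 million rows
    DELETE FROM activity_log
     WHERE logged_at < now() - interval '1 day';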
This table is very active during the day but less active at night.
I think the only area where Postgres is weak is table and index bloat.
For some reason, with the same amount of data arriving every day, Postgres consumes a little more space.
Thanks!
On 08/16/2012 04:33 PM, Anibal David Acosta wrote:
>
> Hi,
>
> If I have a table from which about 8 million rows are deleted
> nightly (the table holds maybe 9 million), is it recommended to
> run a VACUUM ANALYZE after the delete completes, or can I leave
> this job to autovacuum?
>
> This table is very active during the day but less active at night.
>
> I think the only area where Postgres is weak is table and index
> bloat.
>
> For some reason, with the same amount of data arriving every day,
> Postgres consumes a little more space.

Check out pg_reorg.

cheers

andrew
"Anibal David Acosta" <aa@devshock.com> wrote: > if I have a table that daily at night is deleted about 8 millions > of rows (table maybe has 9 millions) is recommended to do a vacuum > analyze after delete completes or can I leave this job to > autovacuum? Deleting a high percentage of the rows should cause autovacuum to deal with the table the next time it wakes up, so an explicit VACUUM ANALYZE shouldn't be needed. > For some reason for the same amount of data every day postgres > consume a little more. How are you measuring the data and how are you measuring the space? And what version of PostgreSQL is this? -Kevin
Thanks Kevin.

Postgres version is 9.1.4 (latest).

Every day the table gets about 7 million new rows. The table holds
the data for 60 days, so in total there must be around 420 million
rows. Every night a delete process runs and removes the rows older
than 60 days.

So the space used by Postgres should not increase drastically,
because 7 million rows arrive every day but the same quantity is
deleted; yet my disk runs out of space every 4 months. I must copy
the tables off the server, delete the local table, and create it
again; after this process I have space for about another 4 months.

Maybe it is a wrong autovacuum config, but it is really complicated
to understand which values are correct to avoid a performance penalty
while still keeping the table in good shape. I think the autovacuum
configuration should have something like an "auto-config" that
recalculates every day which configuration is best for the server's
condition.

Thanks!
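For reference, autovacuum can be tuned for a single table through
storage parameters, without touching the server-wide settings; the
values below are purely illustrative, not a recommendation:

    -- Trigger a vacuum once dead rows exceed
    -- 100,000 + 1% of the table (default: 50 + 20%).
    ALTER TABLE activity_log SET (
        autovacuum_vacuum_scale_factor = 0.01,
        autovacuum_vacuum_threshold    = 100000
    );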
[please don't top-post]

"Anibal David Acosta" <aa@devshock.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> "Anibal David Acosta" <aa@devshock.com> wrote:
>>
>>> If I have a table from which about 8 million rows are deleted
>>> nightly (the table holds maybe 9 million), is it recommended to
>>> run a VACUUM ANALYZE after the delete completes, or can I leave
>>> this job to autovacuum?
>>
>> Deleting a high percentage of the rows should cause autovacuum to
>> deal with the table the next time it wakes up, so an explicit
>> VACUUM ANALYZE shouldn't be needed.

> Every day the table gets about 7 million new rows.
> The table holds the data for 60 days, so in total there must be
> around 420 million rows.
> Every night a delete process runs and removes the rows older than
> 60 days.

Oh, I thought you were saying the table grew to 9 million rows each
day and you deleted 8 million of them each night. That would
definitely trigger autovacuum. Deleting 7 million rows from a table
of 420 million rows would not, so an explicit VACUUM ANALYZE after
the delete might be helpful. Even better, with a pattern like that,
you might want to consider partitioning the table:

http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html

>>> For some reason, with the same amount of data arriving every day,
>>> Postgres consumes a little more space.
>>
>> How are you measuring the data and how are you measuring the
>> space?

> [no answer]

Without knowing what is increasing, it's hard to say why it is
increasing. For all we know, you are logging all statements and never
deleting log files. The solution for that would be entirely different
from the solution for some other problem.

> So the space used by Postgres should not increase drastically,
> because 7 million rows arrive every day but the same quantity is
> deleted; yet my disk runs out of space every 4 months.

What is getting bigger over time?

> I must copy the tables off the server, delete the local table, and
> create it again; after this process I have space for about another
> 4 months.

How do you do that? pg_dump, DROP TABLE, restore the dump? Have you
captured the sizes of heap, toast, indexes, etc. before and after
this aggressive maintenance? Is the size going up by orders of
magnitude, or are you running really tight and getting killed by a
10% increase? We don't know unless you tell us.

> Maybe it is a wrong autovacuum config, but it is really complicated
> to understand which values are correct to avoid a performance
> penalty while still keeping the table in good shape.

Please show us the entire result of running the query here:

http://wiki.postgresql.org/wiki/Server_Configuration

-Kevin
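The per-table sizes Kevin asks about can be captured with the built-in
size functions; a sketch (table name illustrative):

    SELECT relname,
           pg_size_pretty(pg_table_size(oid))          AS heap_plus_toast,
           pg_size_pretty(pg_indexes_size(oid))        AS indexes,
           pg_size_pretty(pg_total_relation_size(oid)) AS total
      FROM pg_class
     WHERE relname = 'activity_log';

And the partitioning the docs link describes is, on 9.1,
inheritance-based: with one child table per day (or week), the nightly
purge becomes a DROP TABLE instead of a huge DELETE, which returns the
space to the operating system immediately and leaves nothing behind
for vacuum to clean up. A minimal sketch under those assumptions, with
illustrative names and dates:

    -- Parent table holds no rows itself.
    CREATE TABLE activity_log (
        logged_at timestamptz NOT NULL,
        payload   text
    );

    -- One child per day; the CHECK constraint lets
    -- constraint_exclusion skip irrelevant partitions.
    CREATE TABLE activity_log_2012_08_16 (
        CHECK (logged_at >= '2012-08-16'
           AND logged_at <  '2012-08-17')
    ) INHERITS (activity_log);

    -- Inserts are routed to the current child, either directly
    -- or through a trigger on the parent.
    INSERT INTO activity_log_2012_08_16 VALUES (now(), 'example');

    -- Nightly purge: drop the partition that fell out of the
    -- 60-day window instead of deleting 7 million rows.
    DROP TABLE activity_log_2012_06_17;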