Re: best practice to avoid table bloat? - Mailing list pgsql-performance

From Kevin Grittner
Subject Re: best practice to avoid table bloat?
Date
Msg-id 502D223702000025000497D7@gw.wicourts.gov
In response to Re: best practice to avoid table bloat?  ("Anibal David Acosta" <aa@devshock.com>)
List pgsql-performance
[please don't top-post]

"Anibal David Acosta" <aa@devshock.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> "Anibal David Acosta" <aa@devshock.com> wrote:
>>
>>> if I have a table from which about 8 million rows (of maybe 9
>>> million total) are deleted nightly, is it recommended to do a
>>> VACUUM ANALYZE after the delete completes, or can I leave this
>>> job to autovacuum?
>>
>> Deleting a high percentage of the rows should cause autovacuum to
>> deal with the table the next time it wakes up, so an explicit
>> VACUUM ANALYZE shouldn't be needed.

> Every day the table receives about 7 million new rows.
> The table holds the data for 60 days, so the total must be
> around 420 million rows.
> Every night a delete process runs and removes rows older than
> 60 days.

Oh, I thought you were saying the table grew to 9 million rows each
day and you deleted 8 million of them each night.  That would
definitely trigger autovacuum.  Deleting 7 million rows from a table
of 420 million rows would not, so an explicit VACUUM ANALYZE after
the delete might be helpful.  Even better, with a pattern like that,
you might want to consider partitioning the table:

http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html
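As a minimal sketch of the inheritance-based partitioning that page
describes (table, column, and partition names here are hypothetical):

```sql
-- Parent table, partitioned by month via inheritance (the 9.1 approach).
CREATE TABLE events (
    event_time  timestamptz NOT NULL,
    payload     text
);

-- One child table per month, constrained to its date range so
-- constraint exclusion can skip irrelevant partitions.
CREATE TABLE events_2012_08 (
    CHECK (event_time >= DATE '2012-08-01'
       AND event_time <  DATE '2012-09-01')
) INHERITS (events);

-- Expiring old data becomes a cheap DROP instead of a huge DELETE,
-- so there are no dead rows to vacuum away:
DROP TABLE events_2012_06;
```

With that layout the nightly delete disappears entirely; you just
drop the partition that has aged out.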

>>> For some reason postgres consumes a little more space every
>>> day for the same amount of data.
>>
>> How are you measuring the data and how are you measuring the
>> space?

> [no answer]

Without knowing what is increasing, it's hard to say why it is
increasing.  For all we know you are logging all statements and
never deleting log files.  The solution for that would be entirely
different from the solution for some other problem.

> So the space used by postgres should not increase drastically,
> because 7 million rows arrive every day but the same quantity is
> deleted; yet my disk runs out of space every 4 months.

What is getting bigger over time?

> I must copy the tables off the server, delete the local table,
> and create it again; after this process I have space for about
> another 4 months.

How do you do that?  pg_dump, DROP TABLE, restore the dump?  Have
you captured sizes of heap, toast, indexes, etc. before and after
this aggressive maintenance?  Is the size going up by orders of
magnitude, or are you running really tight and getting killed by a
10% increase?  We don't know unless you tell us.
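As a sketch, those sizes could be captured before and after the
maintenance with something like this (the table name is hypothetical):

```sql
SELECT pg_size_pretty(pg_relation_size('my_table'))       AS heap,
       pg_size_pretty(pg_indexes_size('my_table'))        AS indexes,
       pg_size_pretty(pg_total_relation_size('my_table')) AS total_incl_toast;
```

Comparing the two snapshots tells you whether the growth is in the
heap, the indexes, or TOAST, which narrows down the cause considerably.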

> Maybe it is a wrong autovacuum config, but it is really
> complicated to know which values are correct to avoid a
> performance penalty while keeping the table in good shape.

Please show us the entire result from running this query:

http://wiki.postgresql.org/wiki/Server_Configuration
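The query on that page is roughly along these lines (check the page
for the canonical version): it lists every setting that has been
changed from its default.

```sql
SELECT version();

SELECT name, current_setting(name), source
  FROM pg_settings
 WHERE source NOT IN ('default', 'override');
```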

-Kevin

