Re: strange table disk sizes - Mailing list pgsql-general

From Rik Bellens
Subject Re: strange table disk sizes
Date
Msg-id 4E5F7CDA.3050907@telin.ugent.be
Whole thread Raw
In response to strange table disk sizes  (Rik Bellens <rik.bellens@telin.ugent.be>)
Responses Re: strange table disk sizes  (Scott Marlowe <scott.marlowe@gmail.com>)
Re: strange table disk sizes  (Scott Marlowe <scott.marlowe@gmail.com>)
List pgsql-general
Op 01-09-11 14:22, Scott Marlowe schreef:
> On Thu, Sep 1, 2011 at 6:05 AM, Rik Bellens<rik.bellens@telin.ugent.be>  wrote:
>> Op 01-09-11 13:31, Scott Marlowe schreef:
>>> On Thu, Sep 1, 2011 at 4:32 AM, Rik Bellens<rik.bellens@telin.ugent.be>
>>>   wrote:
>>>> Hello,
>>>>
>>>> I have two large tables in a database, one containing original data and
>>>> the
>>>> other one derived from the first table. The first table contains several
>>>> columns and indexes, while the second table has less columns and only one
>>>> index. Both tables have the same number of rows. Nevertheless, the second
>>>> table is much larger in disk size than the first one. How can this be
>>>> explained?
>>> This is most likely due to table bloat.  In PostgreSQL when you update
>>> or delete a row, a dead version gets left behind.  Vacuum eventually
>>> comes along and reclaims the empty space to be reused.  If you delete
>>> / update a LOT of rows at once, then you'll have a lot of dead rows
>>> which can only be reused after vacuuming when you do more updates or
>>> deletes later on.
>>>
>>> A few salient questions. What version of PostgreSQL are you running?
>>> Is autovacuum running? Do you do a LOT of bulk deletes / updates?  If
>>> you do a lot of bulk deletes on this table, and you delete everything,
>>> can you switch to using the truncate command instead?
>> I use version 8.3. I see the 'autovacuum launcher process' and 'autovacuum
>> worker process' in the process list, so I suppose autovacuum is running.
>>
>> Rows in the measurement table are added once and never deleted or updated.
>> Adding a row to this table triggers a function that adds a row to the
>> stats_count table. Normally rows are added chronologically. So rows in the
>> stats_count table are normally not updated either. If however, for some
>> reason, a measurement is added from an older time, all rows of that device
>> which come after this time, are updated, but I don't think this will happen
>> very often.
>>
>> The table 'stats_count' was created in a later stage, so the first 45M rows
>> were added at once and chronologically. However, because the function to
>> initialize this table took a long time and the client application crashed a
>> few times, I had to restart this function several times. Can it be that
>> there is some trash left from running this function several times without
>> finishing it? Would it be a solution to run 'VACUUM FULL' to reclaim some
>> disk space?
> Yeah, could be.  Take a look at this page:
> http://wiki.postgresql.org/wiki/Show_database_bloat and see if the
> query there sheds some light on your situ.
thanks for this answer

if i run the query, I get 12433752064 wasted bytes on stats_count_pkey,
so I suppose that is the reason


pgsql-general by date:

Previous
From: Rik Bellens
Date:
Subject: Re: strange table disk sizes
Next
From: Scott Marlowe
Date:
Subject: Re: strange table disk sizes