Re: Heavily modified big table bloat even in auto vacuum is running - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Heavily modified big table bloat even in auto vacuum is running
Date
Msg-id CAA4eK1JjxcyBzGf2GfHpNe8JgzKQ4RE6exNsvVjc3R3PcuFi9w@mail.gmail.com
In response to Re: Heavily modified big table bloat even in auto vacuum is running  (Haribabu kommi <haribabu.kommi@huawei.com>)
Responses Re: Heavily modified big table bloat even in auto vacuum is running  (Haribabu kommi <haribabu.kommi@huawei.com>)
List pgsql-hackers
On Mon, Nov 11, 2013 at 3:14 PM, Haribabu kommi
<haribabu.kommi@huawei.com> wrote:
> On 08 November 2013 18:35 Amit Kapila wrote:
>> On Fri, Nov 8, 2013 at 10:56 AM, Haribabu kommi
>> <haribabu.kommi@huawei.com> wrote:
>> > On 07 November 2013 09:42 Amit Kapila wrote:
>> >> I am not sure whether the same calculation as done for new_rel_tuples
>> >> works for new_dead_tuples; you could check it once.
>> >
>> > I didn't find any way to calculate new_dead_tuples like new_rel_tuples.
>> > I will check it.
>> >
>> > Both approaches yield only approximate values.
>> >
>> > 1. Take a copy of n_dead_tuples before VACUUM starts and subtract it
>> >    once it is done.
>> >    This approach doesn't include the tuples that remain uncleaned
>> >    during the vacuum operation.
>>
>>       Wouldn't the next or future vacuums make the estimate more
>> appropriate?
>
> That is possible only when the nkeep counter value (tuples not cleaned) is very small.
  Do you really expect a large number of dead tuples during VACUUM?

>> > 2. The nkeep counter contains the tuples that are still visible to
>> >    other transactions.
>> >    This approach doesn't include tuples that are deleted on pages
>> >    where the vacuum operation has already finished.
>> >
>> > In my opinion the second approach gives a value nearer to the actual
>> > value, because it also includes some of the new dead tuples. Please
>> > correct me if anything is wrong in my analysis.
>>    I think the main problem in the nkeep logic is to come up with an
>> estimation algorithm similar to the one for live tuples.
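
To make the two quoted approaches concrete, here is a minimal toy sketch in Python. It is not PostgreSQL code: the function names, parameter names, and numbers are hypothetical and chosen only for illustration. It assumes the statistics counter keeps growing while VACUUM runs, and that nkeep is the number of dead tuples VACUUM saw but could not remove.

```python
def estimate_by_snapshot(counter_at_start: int, counter_at_end: int) -> int:
    """Approach 1: subtract the pre-VACUUM snapshot from the final counter value.

    What is left is the number of dead tuples reported while VACUUM was
    running. Tuples VACUUM saw but could not remove are not included.
    """
    return max(counter_at_end - counter_at_start, 0)


def estimate_by_nkeep(nkeep: int) -> int:
    """Approach 2: report the dead tuples VACUUM saw but could not remove.

    Tuples deleted on pages the scan had already passed are not included.
    """
    return nkeep


# Hypothetical numbers, for illustration only.
counter_at_start = 1000  # n_dead_tuples copied before VACUUM starts
counter_at_end = 1200    # same counter once VACUUM is done
nkeep = 50               # dead tuples still visible to other transactions

print("approach 1:", estimate_by_snapshot(counter_at_start, counter_at_end))
print("approach 2:", estimate_by_nkeep(nkeep))
```

Both results are approximations, as the quoted text says. The sketch only shows which quantity each approach reports and makes no claim about which is closer to the real value on a given workload.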
>>
>> By the way, do you have a test case, or can you try to write one
>> that shows this problem, so that after the fix you can verify that
>> the problem is resolved?
>
> The index bloat problem can be simulated using the attached script and SQL.
> With the fix of setting the dead tuples properly,
  Which fix are you referring to here? Is it the one you proposed
in your initial mail?

> the bloat is reduced, and by changing the vacuum cost
> parameters the bloat is avoided.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


