On 08 November 2013 18:35 Amit Kapila wrote:
> On Fri, Nov 8, 2013 at 10:56 AM, Haribabu kommi
> <haribabu.kommi@huawei.com> wrote:
> > On 07 November 2013 09:42 Amit Kapila wrote:
> >> I am not sure whether the same calculation as done for new_rel_tuples
> >> works for new_dead_tuples, you can once check it.
> >
> > I didn't find any way to calculate new_dead_tuples like new_rel_tuples.
> > I will check it.
> >
> >> I am thinking that if we have to do estimation anyway, then wouldn't
> >> it be better to do the way Tom had initially suggested (Maybe we
> >> could have VACUUM copy the n_dead_tuples value as it exists when
> >> VACUUM starts, and then send that as the value to subtract when it's
> >> done?)
> >>
> >> I think the reason you gave that due to tuple visibility check the
> >> number of dead tuples calculated by above logic is not accurate is
> >> right but still it will make the value of dead tuples more
> >> appropriate than it's current value.
> >>
> >> You can check if there is a way to do estimation of dead tuples
> >> similar to new tuples, and it will be as solid as current logic of
> >> vac_estimate_reltuples(), then it's okay, otherwise use the other
> >> solution (using the value of n_dead_tuples at start of Vacuum) to
> >> solve the problem.
> >
> > The two approaches calculations are approximation values only.
> >
> > 1. Taking a copy of n_dead_tuples before VACUUM starts and then
> > subtract it once it is done.
> > This approach doesn't include the tuples which are remains during
> > the vacuum operation.
>
> Wouldn't next or future vacuums make the estimate more
> appropriate?
That is possible only when the nkeep counter (the tuples that could not be cleaned) is very small.
> > 2. nkeep counter contains the tuples which are still visible to other
> > transactions.
> > This approach doesn't include tuples which are deleted on pages
> > where vacuum operation is already finished.
> >
> > In my opinion the second approach gives the value nearer to the actual
> > value, because it includes some of the new dead tuples also. Please
> > correct me if anything wrong in my analysis.
>
> I think main problem in nkeep logic is to come up with an estimation
> algorithm similar to live tuples.
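
To make the difference between the two approaches concrete, here is a rough sketch of the arithmetic in Python. It is only an illustration and not PostgreSQL code: the function name, the three input categories and the sample numbers are all made up for this mail. It also assumes that tuples deleted during the vacuum on pages not yet scanned end up counted in nkeep, which may not always hold.

```python
def dead_tuple_estimates(kept_old, new_dead_scanned, new_dead_unscanned):
    """Return (actual, approach1, approach2) dead tuple counts after a VACUUM.

    All three inputs are hypothetical categories used only for illustration:
      kept_old           - dead tuples that existed at VACUUM start but could
                           not be removed (still visible to other transactions)
      new_dead_scanned   - tuples deleted during VACUUM on pages that VACUUM
                           had already finished
      new_dead_unscanned - tuples deleted during VACUUM on pages not yet
                           scanned (assumed here to be counted in nkeep)
    """
    # Dead tuples really left in the table when VACUUM completes.
    actual = kept_old + new_dead_scanned + new_dead_unscanned

    # Approach 1: counter at end minus the copy taken at start.
    # Only the tuples that died during VACUUM survive the subtraction,
    # so the old tuples that VACUUM had to keep are missed.
    approach1 = new_dead_scanned + new_dead_unscanned

    # Approach 2: report nkeep, the tuples VACUUM saw but could not remove.
    # Deletions on already-finished pages are never seen, so they are missed.
    approach2 = kept_old + new_dead_unscanned

    return actual, approach1, approach2


if __name__ == "__main__":
    for args in [(100, 120, 80), (100, 10, 80)]:
        actual, a1, a2 = dead_tuple_estimates(*args)
        print(f"inputs={args} actual={actual} approach1={a1} approach2={a2}")
```

Which approach lands closer to the actual value depends on how kept_old compares with new_dead_scanned, so neither one is exact.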
>
> By the way, do you have test case or can you try to write a test case
> which can show this problem and then after fix, you can verify if the
> problem is resolved.
The index bloat problem can be reproduced using the attached script and SQL.
With the fix that sets the dead tuple count properly, the bloat is reduced, and by
additionally changing the vacuum cost parameters the bloat is avoided.

The advantage observed with the fix is that autovacuum is triggered more often on
the bloated table than with the original code, because the more accurate dead tuple
count satisfies the vacuum criteria.
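
For reference, the vacuum criteria here is the documented autovacuum rule: a table is vacuumed when its dead tuples exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples. A small Python sketch of why the reported dead tuple count matters follows. The function name and the table numbers are invented for illustration, and the sketch assumes the original code leaves the counter at zero after a vacuum; 50 and 0.2 are the documented default settings.

```python
def needs_autovacuum(n_dead_tuples, reltuples,
                     base_threshold=50, scale_factor=0.2):
    """Documented autovacuum rule: vacuum once dead tuples exceed
    base_threshold + scale_factor * reltuples."""
    return n_dead_tuples > base_threshold + scale_factor * reltuples


if __name__ == "__main__":
    reltuples = 10000        # hypothetical table size
    left_by_vacuum = 1500    # hypothetical dead tuples VACUUM could not remove
    new_dead = 1000          # hypothetical deletes after the VACUUM finished

    # Assumed original behaviour: counter restarts from zero after VACUUM.
    print("original:", needs_autovacuum(0 + new_dead, reltuples))
    # With the fix: counter restarts from the tuples VACUUM left behind.
    print("fixed:   ", needs_autovacuum(left_by_vacuum + new_dead, reltuples))
```

With these made-up numbers the threshold is 2050 dead tuples, so only the fixed accounting crosses it and triggers another autovacuum.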
Regards,
Hari babu.