Re: [Patch] Optimize dropping of relation buffers using dlist - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: [Patch] Optimize dropping of relation buffers using dlist
Date
Msg-id CAA4eK1KsMqP80Gtg77muPD7hqpn_d2f_YfDRxKZfuDERztn6gA@mail.gmail.com
In response to RE: [Patch] Optimize dropping of relation buffers using dlist  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Responses RE: [Patch] Optimize dropping of relation buffers using dlist
List pgsql-hackers
On Fri, Sep 25, 2020 at 2:25 PM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
>
> From: Amit Kapila <amit.kapila16@gmail.com>
> > No, during recovery also we need to be careful. We need to ensure that
> > we use the cached value during recovery and that the cached value is always
> > up-to-date. We can't rely on lseek; I have provided a scenario
> > up thread [1] where such behavior can cause a problem, and see the
> > response from Tom Lane on why the same can be true for recovery as well.
> >
> > The basic approach we are trying to pursue here is to rely on the
> > cached value of 'number of blocks' (as that always gives correct value
> > and even if there is a problem that will be our bug, we don't need to
> > rely on OS for correct value and it will be better w.r.t performance
> > as well). It is currently only possible during recovery so we are
> > using it in recovery path and later once Thomas's patch to cache it
> > for non-recovery cases is also done, we can use it for non-recovery
> > cases as well.
>
> Although I may be still confused, I understood that Kirk-san's patch should:
>
> * Still focus on speeding up the replay of TRUNCATE during recovery.
>
> * During recovery, DropRelFileNodeBuffers() gets the cached size of the relation fork.  If it is cached, trust it and
> optimize the buffer invalidation.  If it's not cached, we can't trust the return value of smgrnblocks() because it's the
> lseek(END) return value, so we avoid the optimization.
>

I agree with the above two points.
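
A minimal sketch of that decision, written as a toy model rather than PostgreSQL code. ToyDropPlan, toy_plan_drop() and TOY_SIZE_UNKNOWN are hypothetical names, and "visit only the blocks being dropped" is an assumption about what the optimized path would do; the thread only says the invalidation is optimized when the cached size can be trusted.

```c
#include <stdbool.h>

/* Hypothetical sentinel meaning "no cached size for this relation fork". */
#define TOY_SIZE_UNKNOWN (-1L)

typedef enum ToyDropStrategy
{
    TOY_DROP_FULL_SCAN,   /* walk every buffer in the pool */
    TOY_DROP_BY_BLOCK     /* visit only the blocks being dropped (assumed) */
} ToyDropStrategy;

typedef struct ToyDropPlan
{
    ToyDropStrategy strategy;
    long            nblocks_to_visit;   /* meaningful only for TOY_DROP_BY_BLOCK */
} ToyDropPlan;

/*
 * Decide how to invalidate buffers for blocks >= first_del_block.
 * The optimized path is taken only in recovery and only when a cached
 * size exists; otherwise we fall back to the full scan, because an
 * lseek-derived size is not trusted here.
 */
ToyDropPlan
toy_plan_drop(bool in_recovery, long cached_nblocks, long first_del_block)
{
    ToyDropPlan plan = { TOY_DROP_FULL_SCAN, 0 };

    if (!in_recovery || cached_nblocks == TOY_SIZE_UNKNOWN)
        return plan;

    plan.strategy = TOY_DROP_BY_BLOCK;
    if (cached_nblocks > first_del_block)
        plan.nblocks_to_visit = cached_nblocks - first_del_block;
    return plan;
}
```

With a cached size of 100 blocks and a truncation point of 40, the plan visits 60 blocks; with no cached size, or outside recovery, it stays on the full scan.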

> * Then, add a new function, say, smgrnblocks_cached() that simply returns the cached block count, and
> DropRelFileNodeBuffers() uses it instead of smgrnblocks().
>

I am not sure it is worth adding a new function for this. Why not
simply add a boolean variable in smgrnblocks for this? BTW, AFAICS,
the latest patch doesn't have code to address this point.
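
One possible reading of the boolean idea, sketched as a standalone toy and not as the real smgrnblocks() signature: the function keeps returning a block count, and an output flag tells the caller whether that count came from the cache. ToyRelation, toy_smgrnblocks(), the accurate parameter and TOY_INVALID_BLOCK are all hypothetical; the file_nblocks field merely stands in for what an lseek-based lookup would report.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical marker for "nothing cached". */
#define TOY_INVALID_BLOCK (-1L)

typedef struct ToyRelation
{
    long cached_nblocks;   /* TOY_INVALID_BLOCK when no size is cached */
    long file_nblocks;     /* stand-in for the lseek(SEEK_END)-derived size */
} ToyRelation;

/*
 * Return the number of blocks in the relation.  If accurate is not NULL,
 * *accurate is set to true only when the value came from the cache, so a
 * caller can skip its optimization when the answer is merely the
 * lseek-derived value.
 */
long
toy_smgrnblocks(const ToyRelation *rel, bool *accurate)
{
    bool from_cache = (rel->cached_nblocks != TOY_INVALID_BLOCK);

    if (accurate != NULL)
        *accurate = from_cache;

    return from_cache ? rel->cached_nblocks : rel->file_nblocks;
}
```

Existing callers that do not care about the distinction could pass NULL for the flag, which is the main appeal over a separate function; the cost is that every call site has to be touched.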

--
With Regards,
Amit Kapila.


