On Wed, Feb 24, 2016 at 8:51 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
> Thank you for remembering this problem, at least for me.
>
>>> Well, turns out there's a quite significant difference, actually. The
>>> index sizes I get (quite stable after multiple runs):
>>>
>>> 9.5 : 2428 MB
>>> 9.6 + alone cleanup : 730 MB
>>> 9.6 + pending lock : 488 MB
>
> Interesting, I don't see why alone_cleanup and pending_lock are so different.
> I'd like to understand that, but does somebody have a good theory?

Under my patch, any backend that wanted to do a cleanup and detected
that someone else was already doing one would wait for the concurrent
cleanup to finish. (This is more consistent with the existing behavior;
I just made it so that they don't do any damage while they wait.)

Under your patch, if a backend wants to do a cleanup and detects that
someone else is already doing one, it just skips the cleanup and
proceeds with whatever it was doing. This allows one process
(hopefully a vacuum, but maybe a user backend) to get pinned down
indefinitely, as other processes keep adding entries to the end of
the pending list with no throttle.

Since the free space recycling only takes place once the list is
completely cleaned, allowing some processes to add to the end while
one poor process is trying to clean can lead to less effective
recycling.

That is my theory, anyway.
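
To make the theory concrete, here is a toy Python model of the two
behaviors. It is only a sketch and not PostgreSQL code: the function
name simulate, the "wait"/"skip" policy labels, and all of the rates
and thresholds are made up for illustration, and real backends are of
course not this regular. It assumes that pages removed from the
pending list become reusable only once a cleanup has emptied the list.

```python
def simulate(policy, ticks=1000, writers=4, clean_rate=3, threshold=8):
    """Toy model of pending-list cleanup (all numbers are hypothetical).

    policy "wait": writers block while a cleanup is running.
    policy "skip": writers keep appending while a cleanup is running.
    Returns the total number of pages ever allocated (the "index size").
    """
    pending = 0    # pages currently on the pending list
    cleaned = 0    # pages cleaned but not yet recyclable
    free = 0       # recycled pages available for reuse
    allocated = 0  # pages taken from the end of the file
    cleaning = False

    for _ in range(ticks):
        for _ in range(writers):
            if cleaning and policy == "wait":
                continue  # stuck behind the concurrent cleanup
            if free > 0:
                free -= 1       # reuse a recycled page
            else:
                allocated += 1  # nothing to reuse, so the index grows
            pending += 1

        if not cleaning and pending >= threshold:
            cleaning = True  # some backend starts a cleanup

        if cleaning:
            n = min(clean_rate, pending)
            pending -= n
            cleaned += n
            if pending == 0:
                # Recycling happens only once the list is fully cleaned.
                free += cleaned
                cleaned = 0
                cleaning = False

    return allocated


if __name__ == "__main__":
    for policy in ("wait", "skip"):
        print(policy, simulate(policy))
```

With these made-up rates the writers collectively outpace the cleaner,
so under "skip" the list never empties, nothing is ever recycled, and
every new pending page extends the index. Under "wait" the cleaner
catches up and the same pages get reused. The real effect is surely
less extreme, but it goes in the same direction as the sizes above.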
Cheers,
Jeff