Re: [HACKERS] Block level parallel vacuum WIP - Mailing list pgsql-hackers

From: Masahiko Sawada
Subject: Re: [HACKERS] Block level parallel vacuum WIP
Date:
Msg-id: CAD21AoDWyL5wM2C6KLQcqgAbLnQwZsdgZ99Wqs+SiuQXAZuDgg@mail.gmail.com
In response to: Re: [HACKERS] Block level parallel vacuum WIP  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers
On Tue, Jan 10, 2017 at 3:46 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Jan 9, 2017 at 2:18 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> On Sat, Jan 7, 2017 at 2:47 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Fri, Jan 6, 2017 at 11:08 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>> On Mon, Oct 3, 2016 at 11:00 AM, Michael Paquier
>>>> <michael.paquier@gmail.com> wrote:
>>>>> On Fri, Sep 16, 2016 at 6:56 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>>>> Yeah, I don't have a good solution for this problem so far.
>>>>>> We might need to improve the group locking mechanism for the updating
>>>>>> operation, or come up with another approach to resolve this problem.
>>>>>> For example, one possible idea is that the launcher process allocates
>>>>>> enough vm and fsm pages in advance in order to avoid relation fork
>>>>>> extension by parallel workers, but that doesn't resolve the fundamental problem.
>>>>>
>>>>
>>>> I got some advice at PGConf.ASIA 2016 and started to work on this again.
>>>>
>>>> The biggest problem so far is group locking. As I mentioned before,
>>>> parallel vacuum workers could try to extend the same visibility map
>>>> page at the same time. So we either need to make group locking
>>>> conflict in some cases, or need to eliminate the necessity of
>>>> acquiring the extension lock. The attached 000 patch uses the former
>>>> idea, making group locking conflict between parallel workers when a
>>>> worker tries to acquire the extension lock on the same page.
>>>>
>>>
>>> How are you planning to ensure the same in the deadlock detector?
>>> Currently, the deadlock detector considers members of the same lock
>>> group as non-blocking.  If you think we don't need to make any changes
>>> in the deadlock detector, then please explain why.
>>>
>>
>> Thank you for the comment.
>> I had not considered the necessity of deadlock detection support. But
>> because lazy_scan_heap acquires the relation extension lock and
>> releases it before acquiring another extension lock, I guess we don't
>> need those changes for parallel lazy vacuum. Thoughts?
>>
>
> Okay, but it is quite possible that lazy_scan_heap is not able to
> acquire the required lock because it is already held by another
> process (one that is not part of the group performing vacuum); then all
> the processes in the group might need to run the deadlock detector
> code, in which it is assumed in multiple places that group members
> won't conflict.  As an example, refer to the code in TopoSort, where it
> tries to emit all groupmates together; IIRC, the basis of that part of
> the code is that groupmates won't conflict with each other, and this
> patch will break that assumption.  I have not looked into the parallel
> vacuum patch, but the changes in
> 000_make_group_locking_conflict_extend_lock_v2 don't appear to be safe.
> Even if your parallel vacuum patch doesn't need any change in the
> deadlock detector, then as proposed it appears that the locking changes
> will behave the same for any operation performing relation extension.
> So in the future any parallel operation (say parallel copy/insert) that
> involves the relation extension lock will behave similarly.  Is that
> okay, or are you assuming that the next person developing such a
> feature should rethink this problem and extend your solution to match
> their requirements?

Thank you for the explanation. I agree that we should support deadlock
detection in this patch as well, even if this feature doesn't actually
need it. I'm going to extend the 000 patch to support deadlock
detection.
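
To make the rule I have in mind concrete, below is a rough standalone
sketch (purely illustrative, with made-up names like would_conflict();
the real decision is taken in LockCheckConflicts() in lock.c, and this
is not the actual 000 patch): group members normally never block each
other, but a relation extension lock request should conflict even
within a lock group.

#include <stdbool.h>
#include <stdio.h>

/*
 * Toy model only; in PostgreSQL the real decision is made in
 * LockCheckConflicts() in lock.c.  All names here are made up.
 */
typedef enum
{
    TOY_LOCKTAG_RELATION,
    TOY_LOCKTAG_RELATION_EXTEND   /* stands in for LOCKTAG_RELATION_EXTEND */
} ToyLockTagType;

typedef struct
{
    int            lock_group;    /* same value = same lock group */
    ToyLockTagType locktag_type;  /* kind of lock requested/held */
} ToyLockRequest;

/*
 * Group members normally never conflict with each other.  The rule the
 * 000 patch is after: relation extension locks conflict even within a
 * lock group, so two parallel vacuum workers cannot extend the same
 * vm/fsm page concurrently.
 */
static bool
would_conflict(const ToyLockRequest *held, const ToyLockRequest *req)
{
    bool        modes_conflict = true;  /* assume the lock modes themselves conflict */

    if (held->lock_group == req->lock_group &&
        req->locktag_type != TOY_LOCKTAG_RELATION_EXTEND)
        return false;                   /* same group, not extension: no conflict */

    return modes_conflict;
}

int
main(void)
{
    ToyLockRequest held = {1, TOY_LOCKTAG_RELATION_EXTEND};
    ToyLockRequest req_extend = {1, TOY_LOCKTAG_RELATION_EXTEND};
    ToyLockRequest req_rel = {1, TOY_LOCKTAG_RELATION};

    printf("extend vs extend, same group: conflict = %d\n",
           would_conflict(&held, &req_extend));   /* 1: still conflicts */
    printf("relation vs relation, same group: conflict = %d\n",
           would_conflict(&held, &req_rel));      /* 0: group exception applies */
    return 0;
}

With that rule, two workers of the same vacuum group trying to extend
the same vm/fsm page still serialize on the extension lock, while every
other lock keeps the existing group-locking behavior.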

>
>
>> What do we actually gain from having the other parts of VACUUM execute
>> in parallel? Does truncation happen faster in parallel?
>>
>
> I think all CPU-intensive operations on the heap (like checking
> dead/live rows, processing dead tuples, etc.) can be faster.

Vacuum on a table with no indexes can be faster as well.
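
As an aside, to illustrate the block-level split the subject line
refers to, here is a purely illustrative sketch of one way heap blocks
could be divided evenly among vacuum workers (made-up names again; the
actual patch may distribute blocks quite differently):

#include <stdio.h>

typedef unsigned int BlockNumber;   /* stands in for PostgreSQL's BlockNumber */

/*
 * Purely illustrative: hand out contiguous block ranges to nworkers.
 * The real patch may instead interleave blocks or use a shared counter.
 */
static void
assign_block_range(BlockNumber nblocks, int nworkers, int worker,
                   BlockNumber *start, BlockNumber *end)
{
    BlockNumber per_worker = nblocks / nworkers;
    BlockNumber remainder = nblocks % nworkers;

    *start = worker * per_worker + (worker < (int) remainder ? worker : remainder);
    *end = *start + per_worker + (worker < (int) remainder ? 1 : 0);   /* exclusive */
}

int
main(void)
{
    BlockNumber nblocks = 1000;
    int         nworkers = 3;

    for (int w = 0; w < nworkers; w++)
    {
        BlockNumber start, end;

        assign_block_range(nblocks, nworkers, w, &start, &end);
        printf("worker %d scans blocks [%u, %u)\n", w, start, end);
    }
    return 0;
}

Contiguous ranges keep each worker's scan sequential; another option is
handing out blocks from a shared counter so faster workers pick up more
of the work.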

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


