Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CAFiTN-snBzg9BVS_FUY1yQ3F0Tt53jW8rwgoW1bw4ptizRNVRQ@mail.gmail.com
In response to Re: [HACKERS] Block level parallel vacuum  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Mon, Nov 4, 2019 at 10:45 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, Nov 3, 2019 at 9:49 AM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Fri, Nov 1, 2019 at 2:21 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > >
> > > I think the two approaches make parallel vacuum workers wait in
> > > different ways: in approach (a) the vacuum delay works as if the
> > > vacuum is performed by a single process, whereas in approach (b)
> > > the vacuum delay works for each worker independently.
> > >
> > > Suppose that the total number of blocks to vacuum is 10,000 blocks,
> > > the cost per block is 10, the cost limit is 200 and the sleep time
> > > is 5 ms. In a single-process vacuum the total sleep time is 2,500 ms
> > > (= (10,000 * 10 / 200) * 5). Approach (a) is the same, 2,500 ms,
> > > because all parallel vacuum workers use the shared balance value and
> > > a worker sleeps once the balance value exceeds the limit. In
> > > approach (b), since the cost limit is divided evenly, the value for
> > > each worker is 40 (e.g. with parallel degree 5). And supposing each
> > > worker processes blocks evenly, the total sleep time of all workers
> > > is 12,500 ms (= (2,000 * 10 / 40) * 5 * 5). I think that's why we
> > > can compute the sleep time of approach (b) by dividing the total
> > > value by the number of parallel workers.
> > >
> > > IOW approach (b) makes parallel vacuum sleep much longer than a
> > > normal vacuum, and than parallel vacuum with approach (a), even
> > > with the same settings. Which behavior do we expect? I thought the
> > > vacuum delay for parallel vacuum should work as if it's a
> > > single-process vacuum, as we did for memory usage. I might be
> > > missing something. If we prefer approach (b) I should change the
> > > patch so that the leader process divides the cost limit evenly.
> > >
> > I have repeated the same tests (test1 and test2)[1] with a higher
> > shared buffer setting (1GB).  Currently, I have used the same formula
> > for computing the total delay: heap scan delay + (index vacuuming
> > delay / workers).  Because, in my opinion, multiple workers are doing
> > I/O here, the total delay will also be a multiple of the number of
> > workers.  So if we want to compare the delay with the sequential
> > vacuum then we should divide the total delay by the number of
> > workers.  But I am not sure whether computing the total delay is the
> > right way to measure the I/O throttling or not.  Still, I support
> > approach (b) of dividing the I/O limit, because autovacuum workers
> > already operate with this approach.
> >
> > test1:
> > normal:   stats delay 1348.160000, hit 68952, miss 2, dirty 10063, total 79017
> > 1 worker: stats delay 1349.585000, hit 68954, miss 2, dirty 10146, total 79102 (cost divide patch)
> > 2 worker: stats delay 1341.416141, hit 68956, miss 2, dirty 10036, total 78994 (cost divide patch)
> > 1 worker: stats delay 1025.495000, hit 78184, miss 2, dirty 14066, total 92252 (share cost patch)
> > 2 worker: stats delay 904.366667,  hit 86482, miss 2, dirty 17806, total 104290 (share cost patch)
> >
> > test2:
> > normal:   stats delay 530.475000, hit 36982, miss 2, dirty 3488, total 40472
> > 1 worker: stats delay 530.700000, hit 36984, miss 2, dirty 3527, total 40513 (cost divide patch)
> > 2 worker: stats delay 530.675000, hit 36984, miss 2, dirty 3532, total 40518 (cost divide patch)
> > 1 worker: stats delay 490.570000, hit 39090, miss 2, dirty 3497, total 42589 (share cost patch)
> > 2 worker: stats delay 480.571667, hit 39050, miss 2, dirty 3819, total 42871 (share cost patch)
> >
> > So with higher shared buffers, I can see that with approach (b) we
> > get the same total delay.  With approach (a) I see a bit less total
> > delay.  But a point to be noted is that I have used the same formula
> > for computing the total delay for both approaches, and Sawada-san
> > explained in the above mail that it may not be the right way to
> > compute the total delay for approach (a).  My take is that whether
> > we are working with the shared cost or dividing the cost, the delay
> > must be divided by the number of workers in the parallel phase.
> >
>
> Why do you think so?  I think with approach (b), if all the workers
> are doing an equal amount of I/O, they will probably sleep at the
> same time, whereas with approach (a) each of them will sleep at
> different times.  So, probably dividing the delay in approach (b)
> makes more sense.

Just to be clear, I did not mean that we divide the sleep time for
each worker.  Actually, I meant how to project the total delay in the
test patch.  So I think if we directly want to compare the sleep time
of the sequential vs the parallel vacuum, then it's not fair to just
compare the total sleep time, because when multiple workers are
working in parallel, shouldn't we consider their average sleep time?
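To make the comparison concrete, here is a small sketch (illustrative only,
not PostgreSQL's actual vacuum cost code) that reproduces the arithmetic
from Sawada-san's example above: approach (a) with a shared cost balance,
approach (b) with the cost limit divided evenly among workers, and the
per-worker average I am suggesting we compare against:

```python
def total_sleep_shared(blocks, cost_per_block, cost_limit, sleep_ms):
    """Approach (a): one shared cost balance, so the total sleep time is
    the same as a single-process vacuum regardless of worker count."""
    return (blocks * cost_per_block / cost_limit) * sleep_ms

def total_sleep_divided(blocks, cost_per_block, cost_limit, sleep_ms, workers):
    """Approach (b): each worker gets cost_limit / workers and sleeps
    independently; return the sum of all workers' sleep time, assuming
    the blocks are distributed evenly among workers."""
    per_worker_limit = cost_limit / workers
    per_worker_blocks = blocks / workers
    per_worker_sleep = (per_worker_blocks * cost_per_block
                        / per_worker_limit) * sleep_ms
    return per_worker_sleep * workers

# Numbers from the example: 10,000 blocks, cost 10/block, limit 200,
# sleep 5 ms, parallel degree 5.
a = total_sleep_shared(10_000, 10, 200, 5)           # 2,500 ms
b = total_sleep_divided(10_000, 10, 200, 5, 5)       # 12,500 ms summed
print(a, b, b / 5)  # dividing b by the worker count recovers 2,500 ms
```

This shows why the raw total of approach (b) looks five times larger: each
worker sleeps as much as a full single-process vacuum against its smaller
limit, so only the average (total / workers) is comparable to the
sequential case.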

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


