Thread: cost based vacuum (parallel)
For parallel vacuum [1], we were discussing the best way to divide the cost among parallel workers, but we didn't get many inputs apart from people who are very actively involved in patch development. I feel that we need some more inputs before we finalize anything, so I am starting a new thread.

The initial version of the patch has a very rudimentary way of doing it, which means each parallel vacuum worker operates independently w.r.t. vacuum delay and cost. This will lead to more I/O in the system than the user intended. Assume that the overall I/O allowed for a vacuum operation is X, after which it will sleep for some time, reset the balance, and continue. In the patch, each worker is allowed to perform X of I/O before it needs to sleep, and there is also no coordination with the master backend, which would have done some I/O for the heap. So, in the worst-case scenario, there can be n times more I/O, where n is the number of workers doing the parallel operation.

This is somewhat similar to a memory usage problem with a parallel query, where each worker is allowed to use up to work_mem of memory. We could say that users running a parallel operation can expect more system resources to be used because they want the operation done faster, so we are fine with this. However, I am not sure that is the right thing, so we should try to come up with some solution, and if the solution is too complex, then we can probably think of documenting such behavior.

The two approaches to solve this problem being discussed in that thread [1] are as follows:

(a) Allow the parallel workers and master backend to have a shared view of vacuum cost related parameters (mainly VacuumCostBalance) and allow each worker to update it and then decide based on that whether it needs to sleep. Sawada-san has done a POC for this approach; see v32-0004-PoC-shared-vacuum-cost-balance in email [2]. One drawback of this approach could be that we allow a worker to sleep even though the I/O has been performed by some other worker.

(b) The other idea is to split the I/O among workers, similar to what we do for autovacuum workers (see autovac_balance_cost). The basic idea would be that before launching workers, we compute the remaining I/O (the heap operation would already have used some), after which we need to sleep, and split it equally across workers. Here, we are primarily thinking of dividing the VacuumCostBalance and VacuumCostLimit parameters. Once the workers are finished, they need to let the master backend know how much I/O they have consumed, and then the master backend can add it to its current I/O consumed. I think we also need to rebalance the cost of the remaining workers once some of the workers exit. Dilip has prepared a POC patch for this; see 0002-POC-divide-vacuum-cost-limit in email [3].

I think approach-2 is better in throttling the system as it doesn't have the drawback of the first approach, but it might be a bit tricky to implement. As of now, POCs for both approaches have been developed and we see similar results for both, but we have only tested simpler cases where each worker has a similar amount of I/O to perform.

Thoughts?

[1] - https://commitfest.postgresql.org/25/1774/
[2] - https://www.postgresql.org/message-id/CAD21AoAqT17QwKJ_sWOqRxNvg66wMw1oZZzf9Rt-E-zD%2BXOh_Q%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAFiTN-thU-z8f04jO7xGMu5yUUpTpsBTvBrFW6EhRf-jGvEz%3Dg%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
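For concreteness, a minimal sketch of what approach (a) could look like at the worker's delay point, assuming a shared-memory struct protected by a spinlock. All names here are hypothetical illustrations, not the actual PoC code from [2]:

/* Hypothetical sketch of approach (a); not the actual PoC code. */
typedef struct LVShared
{
    slock_t     mutex;
    int         cost_balance;   /* shared equivalent of VacuumCostBalance */
} LVShared;

static void
vacuum_delay_point_shared(LVShared *shared)
{
    bool        need_sleep = false;

    SpinLockAcquire(&shared->mutex);
    shared->cost_balance += VacuumCostBalance;  /* publish my local I/O cost */
    if (shared->cost_balance >= VacuumCostLimit)
    {
        shared->cost_balance = 0;   /* reset, as in the single-process case */
        need_sleep = true;
    }
    SpinLockRelease(&shared->mutex);

    VacuumCostBalance = 0;
    if (need_sleep)
        pg_usleep((long) (VacuumCostDelay * 1000));     /* ms -> us */
}

The drawback mentioned above is visible here: whichever worker happens to tip the shared balance over the limit is the one that sleeps, regardless of which worker contributed most of the cost.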
> This is somewhat similar to a memory usage problem with a
> parallel query where each worker is allowed to use up to work_mem of
> memory. We can say that the users using parallel operation can expect
> more system resources to be used as they want to get the operation
> done faster, so we are fine with this. However, I am not sure if that
> is the right thing, so we should try to come up with some solution for
> it and if the solution is too complex, then probably we can think of
> documenting such behavior.
In cloud environments (Amazon + gp2) there's a budget on input/output operations. If you exceed it for a long time, everything starts looking like you're working with a floppy disk.

For ease of configuration, I would need a "max_vacuum_disk_iops" that would limit the number of input/output operations performed by all of the vacuums in the system. If I set it to less than the budget refill rate, I can be sure that no vacuum runs fast enough to impact any sibling query.

There's also value in a non-throttled VACUUM for smaller tables. On gp2 such things will be consumed out of the surge budget, whose size is known to the sysadmin. Let's call it "max_vacuum_disk_surge_iops": if a relation has fewer blocks than this value and the situation is blocking in any way (antiwraparound, interactive console, ...), go ahead and run without throttling.

As for how to balance the cost: if we know the number of vacuum processes that were running in the previous second, we can just divide this iteration's budget slot by that number.

To correct for overshoots, we can subtract the previous second's overshoot from the next one's budget. That would also allow accounting for surge budget usage and letting it refill, pausing all autovacuum for some time after a manual one.

Accounting for and limiting the operation count more often than once a second isn't beneficial for this use case.

Please don't forget that processing one page can turn into several IOPS (read, write, WAL).
Does this make sense? :)
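A rough sketch of the once-per-second accounting described above. Everything here is hypothetical: the "max_vacuum_disk_iops" GUC is the one proposed in the message, and the helper functions are illustrative placeholders:

/* Hypothetical once-per-second budget refill; not PostgreSQL code. */
static double overshoot = 0.0;      /* ops consumed beyond the last slot */

static void
refill_vacuum_iops_budget(void)
{
    int         nvacuums = count_running_vacuums();     /* hypothetical helper */
    double      slot;

    if (nvacuums <= 0)
        nvacuums = 1;

    /* split the global budget by the number of vacuums seen last second */
    slot = (double) max_vacuum_disk_iops / nvacuums;

    /* subtract last second's overshoot before handing out new budget */
    slot = (slot > overshoot) ? slot - overshoot : 0.0;

    /* hypothetical helper: distributes the slot, returns ops used over it */
    overshoot = hand_out_budget(slot);
}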
On Mon, Nov 4, 2019 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I think approach-2 is better in throttling the system as it doesn't
> have the drawback of the first approach, but it might be a bit tricky
> to implement.

I might be missing something, but I think the drawback of approach-1 could also exist in approach-2, depending on which index pages are loaded in shared buffers and on the vacuum delay setting. Is that right?

Regards,

--
Masahiko Sawada
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Nov 4, 2019 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Nov 4, 2019 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > I think approach-2 is better in throttling the system as it doesn't
> > have the drawback of the first approach, but it might be a bit tricky
> > to implement.
>
> I might be missing something, but I think the drawback of approach-1
> could also exist in approach-2, depending on which index pages are
> loaded in shared buffers and on the vacuum delay setting.
>

Can you be a bit more specific about this?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Mon, Nov 4, 2019 at 1:03 PM Darafei "Komяpa" Praliaskouski <me@komzpa.net> wrote:
>
> In cloud environments (Amazon + gp2) there's a budget on input/output
> operations. If you exceed it for a long time, everything starts looking
> like you're working with a floppy disk.
>
> For ease of configuration, I would need a "max_vacuum_disk_iops" that
> would limit the number of input/output operations performed by all of
> the vacuums in the system. If I set it to less than the budget refill
> rate, I can be sure that no vacuum runs fast enough to impact any
> sibling query.
>
> There's also value in a non-throttled VACUUM for smaller tables. On gp2
> such things will be consumed out of the surge budget, whose size is
> known to the sysadmin. Let's call it "max_vacuum_disk_surge_iops": if a
> relation has fewer blocks than this value and the situation is blocking
> in any way (antiwraparound, interactive console, ...), go ahead and run
> without throttling.
>

I think the need for these things can be addressed by the current cost-based vacuum parameters; see the docs [1]. For example, if you set vacuum_cost_delay to zero, it will allow the operation to be performed without throttling.

> As for how to balance the cost: if we know the number of vacuum
> processes that were running in the previous second, we can just divide
> this iteration's budget slot by that number.
>
> To correct for overshoots, we can subtract the previous second's
> overshoot from the next one's budget. That would also allow accounting
> for surge budget usage and letting it refill, pausing all autovacuum
> for some time after a manual one.
>
> Accounting for and limiting the operation count more often than once a
> second isn't beneficial for this use case.
>

I think it is better if we find a way to rebalance the cost when a worker exits rather than every second, as it won't change unless a worker exits anyway.

[1] - https://www.postgresql.org/docs/devel/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Mon, 4 Nov 2019 at 19:26, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Nov 4, 2019 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I might be missing something, but I think the drawback of approach-1
> > could also exist in approach-2, depending on which index pages are
> > loaded in shared buffers and on the vacuum delay setting.
> >
>
> Can you be a bit more specific about this?

Suppose there are two indexes: one index is entirely loaded in shared buffers while the other isn't. One vacuum worker, which processes the former index, hits all pages in shared buffers, while another worker, which processes the latter index, reads all pages from either the OS page cache or disk. Even if both the cost limit and the cost balance are split evenly among workers, because the costs of page hits and page misses differ, it's possible that one vacuum worker sleeps while other workers are doing I/O.

Regards,

--
Masahiko Sawada
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
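To put numbers on this asymmetry, using the default cost parameters (vacuum_cost_page_hit = 1, vacuum_cost_page_miss = 10, vacuum_cost_limit = 200): if the limit is split evenly between two workers (100 each), the worker whose index is fully cached must process about 100 pages (100 hits x 1) before it sleeps, while the worker reading from disk sleeps after only 10 pages (10 misses x 10). With a shared balance, the same asymmetry means the cached worker can be the one that happens to trip the limit and sleep, even though most of the accumulated cost came from the other worker's reads.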
On Mon, Nov 4, 2019 at 1:54 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> For parallel vacuum [1], we were discussing the best way to
> divide the cost among parallel workers, but we didn't get many inputs
> apart from people who are very actively involved in patch development.
> I feel that we need some more inputs before we finalize anything, so
> I am starting a new thread.
Maybe I just don't have experience with the type of system that parallel vacuum is needed for, but if any meaningful IO throttling is active, then what is the point of doing the vacuum in parallel in the first place?
Cheers,
Jeff
Hi,

On 2019-11-04 12:24:35 +0530, Amit Kapila wrote:
> The initial version of the patch has a very rudimentary way of doing
> it, which means each parallel vacuum worker operates independently
> w.r.t. vacuum delay and cost.

Yea, that seems not ok for cases where vacuum delay is active. There's also the question of when/why it is beneficial to use parallelism when you're going to encounter IO limits in all likelihood.

> This will lead to more I/O in the system than the user intended. [...]
> So, in the worst-case scenario, there can be n times more I/O, where n
> is the number of workers doing the parallel operation. This is somewhat
> similar to a memory usage problem with a parallel query, where each
> worker is allowed to use up to work_mem of memory.

I mean for parallel query the problem wasn't really introduced in parallel query, it existed before - and does still - for non-parallel queries. And there's a complex underlying planning issue. I don't think this is a good excuse for VACUUM, where none of the complex "number of paths considered" issues etc. apply.

> (a) Allow the parallel workers and master backend to have a shared
> view of vacuum cost related parameters (mainly VacuumCostBalance) and
> allow each worker to update it and then decide based on that whether
> it needs to sleep. [...] One drawback of this approach could be that we
> allow a worker to sleep even though the I/O has been performed by some
> other worker.

I don't understand this drawback.

> (b) The other idea is to split the I/O among workers, similar to what
> we do for autovacuum workers (see autovac_balance_cost). [...]

(b) doesn't strike me as advantageous. It seems quite possible that you end up with one worker that has a lot more IO than others, leading to unnecessary sleeps, even though the actually available IO budget has not been used up. Quite easy to see how that'd lead to parallel VACUUM having a lower throughput than a single-threaded one.

Greetings,

Andres Freund
Hi,

On 2019-11-04 12:59:02 -0500, Jeff Janes wrote:
> Maybe I just don't have experience with the type of system that parallel
> vacuum is needed for, but if any meaningful IO throttling is active,
> then what is the point of doing the vacuum in parallel in the first
> place?

I am wondering the same - but to be fair, it's pretty easy to run into cases where VACUUM is CPU bound. E.g. because most pages are in shared_buffers, and, compared to the size of the indexes, the number of tids that need to be pruned is fairly small (also [1]). That means a lot of pages need to be scanned, without a whole lot of IO going on. The problem with that is just that the defaults for vacuum throttling will also apply here; I've never seen anybody tune vacuum_cost_page_hit = 0, vacuum_cost_page_dirty = 0 or such (in contrast, the latter is the highest cost currently). Nor do we reduce the cost of vacuum_cost_page_dirty for unlogged tables.

So while it doesn't seem unreasonable to want to use cost limiting to protect against vacuum unexpectedly causing too much, especially read, IO, I'm doubtful it has current practical relevance.

I'm wondering how much of the benefit of parallel vacuum really is just to work around vacuum ringbuffers often massively hurting performance (see e.g. [2]). Surely not all, but I'd be very unsurprised if it were a large fraction.

Greetings,

Andres Freund

[1] I don't think the patch addresses this, IIUC it's only running index vacuums in parallel, but it's very easy to run into being CPU bottlenecked when vacuuming a busily updated table. heap_hot_prune can be really expensive, especially with longer update chains (I think it may even have an O(n^2) worst case).

[2] https://www.postgresql.org/message-id/20160406105716.fhk2eparljthpzp6%40alap3.anarazel.de
Greetings,

* Jeff Janes (jeff.janes@gmail.com) wrote:
> Maybe I just don't have experience with the type of system that parallel
> vacuum is needed for, but if any meaningful IO throttling is active,
> then what is the point of doing the vacuum in parallel in the first
> place?

With parallelization across indexes, you could have a situation where the individual indexes are on different tablespaces with independent i/o, and therefore the parallelization ends up giving you an increase in i/o throughput, not just additional CPU time.

Thanks,

Stephen
Hi,

On 2019-11-04 14:06:19 -0500, Stephen Frost wrote:
> With parallelization across indexes, you could have a situation where
> the individual indexes are on different tablespaces with independent
> i/o, and therefore the parallelization ends up giving you an increase
> in i/o throughput, not just additional CPU time.

How's that related to IO throttling being active or not?

Greetings,

Andres Freund
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2019-11-04 14:06:19 -0500, Stephen Frost wrote:
> > With parallelization across indexes, you could have a situation where
> > the individual indexes are on different tablespaces with independent
> > i/o, and therefore the parallelization ends up giving you an increase
> > in i/o throughput, not just additional CPU time.
>
> How's that related to IO throttling being active or not?

You might find that you have to throttle the IO down when operating exclusively against one IO channel, but if you have multiple IO channels then the acceptable IO utilization could be higher, as it would be spread across the different IO channels.

In other words, the overall i/o allowance for a given operation might be able to be higher if it's spread across multiple i/o channels, as it wouldn't completely consume the i/o resources of any of them, whereas with a higher allowance and a single i/o channel, there would likely be an impact on other operations.

Whether this is really relevant only for parallel operations is an interesting question: these considerations might not require actual parallel operations, as a single process might be able to go through multiple indexes concurrently and still hit the i/o limit that was set for it overall across the tablespaces. I don't know that it would actually be interesting or useful to spend the effort to make that work though, so, from a practical perspective, it's probably only interesting to think about this when talking about parallel vacuum.

I've been wondering if the accounting system should consider the cost per tablespace when there are multiple tablespaces involved, instead of throttling the overall process without consideration for the per-tablespace utilization.

Thanks,

Stephen
Hi,

On 2019-11-04 14:33:41 -0500, Stephen Frost wrote:
> You might find that you have to throttle the IO down when operating
> exclusively against one IO channel, but if you have multiple IO channels
> then the acceptable IO utilization could be higher, as it would be
> spread across the different IO channels.
> [...]

But you could just apply different budgets for different tablespaces? That's quite doable independent of parallelism, as we don't have tables or indexes spanning more than one tablespace. True, you could then make the processing of an individual vacuum faster by allowing it to utilize multiple tablespace budgets at the same time.

> I've been wondering if the accounting system should consider the cost
> per tablespace when there are multiple tablespaces involved, instead of
> throttling the overall process without consideration for the
> per-tablespace utilization.

This all seems like a feature proposal, or two, independent of the patch/question at hand. I think there's a good argument to be had that we should severely overhaul the current vacuum cost limiting - it's way way too hard to understand the bandwidth that it's allowed to consume. But unless one of the proposals makes that measurably harder or easier, I think we don't gain anything by entangling an already complex patchset with something new.

Greetings,

Andres Freund
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> But you could just apply different budgets for different tablespaces?

Yes, that would be one approach to addressing this, though it would change the existing meaning of those cost parameters. I'm not sure if we think that's an issue or not - if we only have this in the case of a parallel vacuum then it's probably fine; I'm less sure it'd be alright to change that on an upgrade.

> That's quite doable independent of parallelism, as we don't have tables
> or indexes spanning more than one tablespace. True, you could then make
> the processing of an individual vacuum faster by allowing it to utilize
> multiple tablespace budgets at the same time.

Yes, it's possible to do this independent of parallelism, but what I was trying to get at above is that it might not be worth the effort. When it comes to parallel vacuum though, I'm not sure that you can just punt on this question, since you'll naturally end up spanning multiple tablespaces concurrently, at least if the heap+indexes are spread across multiple tablespaces and you're operating against more than one of those relations at a time (which, I admit, I'm not 100% sure is actually happening with this proposed patch set - if it isn't, then this isn't really an issue, though that would be pretty unfortunate, as then you can't leverage multiple i/o channels concurrently, and therefore Jeff's question about why you'd be doing parallel vacuum with IO throttling is a pretty good one).

Thanks,

Stephen
On Mon, Nov 4, 2019 at 11:42 PM Andres Freund <andres@anarazel.de> wrote:
>
> > (a) Allow the parallel workers and master backend to have a shared
> > view of vacuum cost related parameters (mainly VacuumCostBalance) and
> > allow each worker to update it and then decide based on that whether
> > it needs to sleep. [...] One drawback of this approach could be that we
> > allow a worker to sleep even though the I/O has been performed by some
> > other worker.
>
> I don't understand this drawback.
>

I think the problem could be that the system is not properly throttled when it is supposed to be. Let me try with a simple example: say we have two workers, w-1 and w-2. w-2 is doing most of the I/O and w-1 is doing very little, but unfortunately whenever w-1 checks, it finds that the cost_limit has been exceeded and it goes to sleep, while w-2 continues. In such a situation, even though we have made one of the workers sleep for the required time, ideally the worker which was doing the I/O should have slept. The aim is to make the system stop doing I/O whenever the limit has been exceeded, and that might not happen in the above situation.

> > (b) The other idea is to split the I/O among workers, similar to what
> > we do for autovacuum workers (see autovac_balance_cost). [...]
>
> (b) doesn't strike me as advantageous. It seems quite possible that you
> end up with one worker that has a lot more IO than others, leading to
> unnecessary sleeps, even though the actually available IO budget has not
> been used up.
>

Yeah, this is possible, but to an extent this is possible in the current design as well, where we balance the cost among autovacuum workers. Now, it is quite possible that the current design itself is not good and we don't want to do the same thing in another place, but at least we will be consistent and can explain the overall behavior.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Mon, Nov 4, 2019 at 11:58 PM Andres Freund <andres@anarazel.de> wrote:
>
> I am wondering the same - but to be fair, it's pretty easy to run into
> cases where VACUUM is CPU bound. [...]
>
> So while it doesn't seem unreasonable to want to use cost limiting to
> protect against vacuum unexpectedly causing too much, especially read,
> IO, I'm doubtful it has current practical relevance.
>

IIUC, you mean to say that it is not of much practical use to do a parallel vacuum if I/O throttling is enabled for an operation, is that right?

> I'm wondering how much of the benefit of parallel vacuum really is just
> to work around vacuum ringbuffers often massively hurting performance
> (see e.g. [2]).
>

Yeah, it is a good thing to check, but if anything, I think a parallel vacuum will further improve performance with larger ring buffers, as it will make it more CPU bound.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Tue, Nov 5, 2019 at 1:12 AM Andres Freund <andres@anarazel.de> wrote:
> On 2019-11-04 14:33:41 -0500, Stephen Frost wrote:
> > I've been wondering if the accounting system should consider the cost
> > per tablespace when there are multiple tablespaces involved, instead of
> > throttling the overall process without consideration for the
> > per-tablespace utilization.
>
> This all seems like a feature proposal, or two, independent of the
> patch/question at hand. [...]
>

+1. I think even if we want something like per-tablespace costing for (parallel) vacuum, it should be done as a separate patch. It is a whole new area where we need to define the appropriate way to achieve it. It would change the current vacuum costing system in a big way, which I don't think is reasonable to do as part of a parallel vacuum patch.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Tue, Nov 5, 2019 at 2:40 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Nov 4, 2019 at 11:58 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > I'm wondering how much of the benefit of parallel vacuum really is just
> > to work around vacuum ringbuffers often massively hurting performance
> > (see e.g. [2]).
> >
>
> Yeah, it is a good thing to check, but if anything, I think a parallel
> vacuum will further improve performance with larger ring buffers, as it
> will make it more CPU bound.

I have tested the same, and the results show that by increasing the ring buffer size we can see a performance gain. And the gain is much larger with the parallel vacuum.

Test case:

create table test(a int, b int, c int, d int, e int, f int, g int, h int);
create index idx1 on test(a);
create index idx2 on test(b);
create index idx3 on test(c);
create index idx4 on test(d);
create index idx5 on test(e);
create index idx6 on test(f);
create index idx7 on test(g);
create index idx8 on test(h);
insert into test select i,i,i,i,i,i,i,i from generate_series(1,1000000) as i;
delete from test where a < 300000;

(I have tested the parallel vacuum and non-parallel vacuum with different ring buffer sizes.)

8 indexes
ring buffer size 246kB -> non-parallel: 7.6 seconds, parallel (2 workers): 3.9 seconds
ring buffer size 256MB -> non-parallel: 6.1 seconds, parallel (2 workers): 3.2 seconds

4 indexes
ring buffer size 246kB -> non-parallel: 4.8 seconds, parallel (2 workers): 3.2 seconds
ring buffer size 256MB -> non-parallel: 3.8 seconds, parallel (2 workers): 2.6 seconds

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Hi,

On November 5, 2019 7:16:41 AM PST, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> I have tested the same, and the results show that by increasing the ring
> buffer size we can see a performance gain. And the gain is much larger
> with the parallel vacuum.
> [...]

Thanks!

> 8 indexes
> ring buffer size 246kB -> non-parallel: 7.6 seconds, parallel (2 workers): 3.9 seconds
> ring buffer size 256MB -> non-parallel: 6.1 seconds, parallel (2 workers): 3.2 seconds
>
> 4 indexes
> ring buffer size 246kB -> non-parallel: 4.8 seconds, parallel (2 workers): 3.2 seconds
> ring buffer size 256MB -> non-parallel: 3.8 seconds, parallel (2 workers): 2.6 seconds

What about the case of just disabling the ring buffer logic?

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
On Tue, Nov 5, 2019 at 8:49 PM Andres Freund <andres@anarazel.de> wrote:
>
> What about the case of just disabling the ring buffer logic?
>

Repeated the same test by disabling the ring buffer logic. The results are almost the same as with increasing the ring buffer size. Tested with 4GB shared buffers:

8 indexes
use shared buffers -> non-parallel: 6.2 seconds, parallel (2 workers): 3.3 seconds

4 indexes
use shared buffers -> non-parallel: 3.8 seconds, parallel (2 workers): 2.7 seconds

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Tue, Nov 5, 2019 at 1:42 AM Stephen Frost <sfrost@snowman.net> wrote:
> * Andres Freund (andres@anarazel.de) wrote:
> > That's quite doable independent of parallelism, as we don't have tables
> > or indexes spanning more than one tablespace. True, you could then make
> > the processing of an individual vacuum faster by allowing it to utilize
> > multiple tablespace budgets at the same time.
>
> Yes, it's possible to do this independent of parallelism, but what I was
> trying to get at above is that it might not be worth the effort. When
> it comes to parallel vacuum though, I'm not sure that you can just punt
> on this question, since you'll naturally end up spanning multiple
> tablespaces concurrently, at least if the heap+indexes are spread across
> multiple tablespaces and you're operating against more than one of those
> relations at a time
>

Each parallel worker operates on a separate index. It might be worth exploring per-tablespace vacuum throttling, but that should not be a requirement for the currently proposed patch. As per feedback in this thread, it seems that for now it is better if we allow a parallel vacuum only when I/O throttling is not enabled. We can later extend it based on feedback from the field once the feature starts getting used.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Hi,

On 2019-11-06 07:53:09 +0530, Amit Kapila wrote:
> As per feedback in this thread, it seems that for now it is better if
> we allow a parallel vacuum only when I/O throttling is not enabled. We
> can later extend it based on feedback from the field once the feature
> starts getting used.

That's not my read on this thread. I don't think we should introduce this feature without a solution for the throttling.

Greetings,

Andres Freund
On Wed, Nov 6, 2019 at 7:55 AM Andres Freund <andres@anarazel.de> wrote:
>
> That's not my read on this thread. I don't think we should introduce
> this feature without a solution for the throttling.
>

Okay, then I misunderstood your response to Jeff's email [1]. Anyway, we have already explored two different approaches, as mentioned in the initial email, which have somewhat similar results in initial tests. So we can explore more along those lines. Do you have any preference or any other idea?

[1] - https://www.postgresql.org/message-id/20191104182829.57bkz64qn5k3uwc3%40alap3.anarazel.de

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Greetings,

* Amit Kapila (amit.kapila16@gmail.com) wrote:
> Each parallel worker operates on a separate index. It might be worth
> exploring per-tablespace vacuum throttling, but that should not be a
> requirement for the currently proposed patch.

Right, that each operates on a separate index in parallel is what I had figured was probably happening, and that's why I brought up the question of "well, what does IO throttling mean when you've got multiple tablespaces involved with presumably independent IO channels...?" (or, at least, that's what I was trying to go for).

This isn't a question with the current system and the way the code works within a single vacuum operation, as we're never operating on more than one relation concurrently in that case.

Of course, we don't currently do anything to manage IO utilization across tablespaces when there are multiple autovacuum workers running concurrently, which I suppose goes to Andres' point that we aren't really doing anything to deal with this today, and therefore this is perhaps not all that new of an issue just with the addition of parallel vacuum. I'd still argue that it becomes a lot more apparent when you're talking about one parallel vacuum, but ultimately we should probably be thinking about how to manage the resources across all the vacuums and tablespaces and queries and such.

In an ideal world, we'd track the i/o from front-end queries, have some idea of the total i/o possible for each IO channel, and allow vacuum and whatever other background processes need to run to scale up and down, with enough buffer to avoid ever being maxed out on i/o, but keeping up a consistent rate of i/o that lets everything finish as quickly as possible.

Thanks,

Stephen
On Tue, Nov 5, 2019 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I think the problem could be that the system is not properly throttled
> when it is supposed to be. Let me try with a simple example: say we
> have two workers, w-1 and w-2. w-2 is doing most of the I/O and w-1 is
> doing very little, but unfortunately whenever w-1 checks, it finds that
> the cost_limit has been exceeded and it goes to sleep, while w-2
> continues. In such a situation, even though we have made one of the
> workers sleep for the required time, ideally the worker which was doing
> the I/O should have slept. The aim is to make the system stop doing I/O
> whenever the limit has been exceeded, and that might not happen in the
> above situation.
>

One idea to fix this drawback is that if we somehow avoid letting the workers that have done less or no I/O (compared to other workers) sleep, then we can to a good extent ensure that the workers doing more I/O will be throttled more. What we can do is allow a worker to sleep only if it has performed I/O above a certain threshold and the overall balance is more than the cost_limit set by the system. Then we allow the worker to sleep proportionally to the work done by it and reduce the VacuumSharedCostBalance by the amount consumed by the current worker. Something like:

if (VacuumSharedCostBalance >= VacuumCostLimit &&
    MyCostBalance > threshold * VacuumCostLimit / workers)
{
    VacuumSharedCostBalance -= MyCostBalance;
    Sleep(delay * MyCostBalance / VacuumSharedCostBalance);
}

Assume the threshold to be 0.5; what that means is, if the worker has done more than 50% of the work expected from it and the overall shared cost balance is exceeded, then we will consider this worker for sleeping.

What do you guys think?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
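A minimal sketch of how the above could look as a worker's delay point, assuming a shared-memory struct protected by a spinlock and a worker-local MyCostBalance counter. All names, the locking scheme, and the 0.5 threshold are assumptions taken from the example above, not actual patch code:

/* Hypothetical sketch of the proposal above; not actual patch code. */
#define SLEEP_THRESHOLD 0.5         /* assumed, per the example above */

static int  MyCostBalance = 0;      /* I/O cost accrued by this worker */

static void
parallel_vacuum_delay_point(LVShared *shared, int nworkers)
{
    double      msec = 0;

    /* fold the cost accrued since the last check into both balances */
    MyCostBalance += VacuumCostBalance;

    SpinLockAcquire(&shared->mutex);
    shared->cost_balance += VacuumCostBalance;
    if (shared->cost_balance >= VacuumCostLimit &&
        MyCostBalance > SLEEP_THRESHOLD * VacuumCostLimit / nworkers)
    {
        /* give my consumption back and sleep in proportion to my share */
        shared->cost_balance -= MyCostBalance;
        msec = VacuumCostDelay * MyCostBalance / Max(shared->cost_balance, 1);
        MyCostBalance = 0;
    }
    SpinLockRelease(&shared->mutex);

    VacuumCostBalance = 0;
    if (msec > 0)
        pg_usleep((long) (msec * 1000));    /* ms -> us */
}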
On Wed, Nov 6, 2019 at 9:21 AM Stephen Frost <sfrost@snowman.net> wrote:
>
> In an ideal world, we'd track the i/o from front-end queries, have some
> idea of the total i/o possible for each IO channel, and allow vacuum and
> whatever other background processes need to run to scale up and down,
> with enough buffer to avoid ever being maxed out on i/o, but keeping up
> a consistent rate of i/o that lets everything finish as quickly as
> possible.

IMHO, suppose in the future we improve the I/O throttling for each tablespace, maybe by maintaining an independent balance for the relation and each of its indexes, or perhaps a combined balance for the indexes that are on the same tablespace, with each balance checked against that tablespace's I/O limit. If we get such a mechanism, it seems it would extend easily to parallel vacuum, wouldn't it? Because across workers, too, we can track a tablespace-wise shared balance (if we go with the shared costing approach, for example).

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
On Wed, 6 Nov 2019 at 15:45, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Nov 5, 2019 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Nov 4, 2019 at 11:42 PM Andres Freund <andres@anarazel.de> wrote: > > > > > > > > > > The two approaches to solve this problem being discussed in that > > > > thread [1] are as follows: > > > > (a) Allow the parallel workers and master backend to have a shared > > > > view of vacuum cost related parameters (mainly VacuumCostBalance) and > > > > allow each worker to update it and then based on that decide whether > > > > it needs to sleep. Sawada-San has done the POC for this approach. > > > > See v32-0004-PoC-shared-vacuum-cost-balance in email [2]. One > > > > drawback of this approach could be that we allow the worker to sleep > > > > even though the I/O has been performed by some other worker. > > > > > > I don't understand this drawback. > > > > > > > I think the problem could be that the system is not properly throttled > > when it is supposed to be. Let me try by a simple example, say we > > have two workers w-1 and w-2. The w-2 is primarily doing the I/O and > > w-1 is doing very less I/O but unfortunately whenever w-1 checks it > > finds that cost_limit has exceeded and it goes for sleep, but w-1 > > still continues. > > > > Typo in the above sentence. /but w-1 still continues/but w-2 still continues. > > > Now in such a situation even though we have made one > > of the workers slept for a required time but ideally the worker which > > was doing I/O should have slept. The aim is to make the system stop > > doing I/O whenever the limit has exceeded, so that might not work in > > the above situation. > > > > One idea to fix this drawback is that if we somehow avoid letting the > workers sleep which has done less or no I/O as compared to other > workers, then we can to a good extent ensure that workers which are > doing more I/O will be throttled more. What we can do is to allow any > worker sleep only if it has performed the I/O above a certain > threshold and the overall balance is more than the cost_limit set by > the system. Then we will allow the worker to sleep proportional to > the work done by it and reduce the VacuumSharedCostBalance by the > amount which is consumed by the current worker. Something like: > > If ( VacuumSharedCostBalance >= VacuumCostLimit && > MyCostBalance > (threshold) VacuumCostLimit / workers) > { > VacuumSharedCostBalance -= MyCostBalance; > Sleep (delay * MyCostBalance/VacuumSharedCostBalance) > } > > Assume threshold be 0.5, what that means is, if it has done work more > than 50% of what is expected from this worker and the overall share > cost balance is exceeded, then we will consider this worker to sleep. > > What do you guys think? I think the idea that workers consuming more I/O sleep for a longer time seems good. It doesn't seem to have the drawback of approach (b), which can unnecessarily delay vacuum if some indexes are very small or if bulk-deletion of an index does almost nothing, as with brin. But on the other hand, it's possible that workers don't sleep even though the shared cost balance already exceeds the limit, because sleeping requires that the local balance exceed the limit divided by the number of workers. For example, one worker is scheduled, does I/O, and exceeds the limit substantially while the other 2 workers do less I/O. And then the 2 workers are scheduled and consume I/O.
The total cost balance already exceeds the limit but the workers will not sleep as long as the local balance is less than (limit / # of workers). Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Nov 8, 2019 at 8:18 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Wed, 6 Nov 2019 at 15:45, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Nov 5, 2019 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Nov 4, 2019 at 11:42 PM Andres Freund <andres@anarazel.de> wrote: > > > > > > > > > > > > > The two approaches to solve this problem being discussed in that > > > > > thread [1] are as follows: > > > > > (a) Allow the parallel workers and master backend to have a shared > > > > > view of vacuum cost related parameters (mainly VacuumCostBalance) and > > > > > allow each worker to update it and then based on that decide whether > > > > > it needs to sleep. Sawada-San has done the POC for this approach. > > > > > See v32-0004-PoC-shared-vacuum-cost-balance in email [2]. One > > > > > drawback of this approach could be that we allow the worker to sleep > > > > > even though the I/O has been performed by some other worker. > > > > > > > > I don't understand this drawback. > > > > > > > > > > I think the problem could be that the system is not properly throttled > > > when it is supposed to be. Let me try by a simple example, say we > > > have two workers w-1 and w-2. The w-2 is primarily doing the I/O and > > > w-1 is doing very less I/O but unfortunately whenever w-1 checks it > > > finds that cost_limit has exceeded and it goes for sleep, but w-1 > > > still continues. > > > > > > > Typo in the above sentence. /but w-1 still continues/but w-2 still continues. > > > > > Now in such a situation even though we have made one > > > of the workers slept for a required time but ideally the worker which > > > was doing I/O should have slept. The aim is to make the system stop > > > doing I/O whenever the limit has exceeded, so that might not work in > > > the above situation. > > > > > > > One idea to fix this drawback is that if we somehow avoid letting the > > workers sleep which has done less or no I/O as compared to other > > workers, then we can to a good extent ensure that workers which are > > doing more I/O will be throttled more. What we can do is to allow any > > worker sleep only if it has performed the I/O above a certain > > threshold and the overall balance is more than the cost_limit set by > > the system. Then we will allow the worker to sleep proportional to > > the work done by it and reduce the VacuumSharedCostBalance by the > > amount which is consumed by the current worker. Something like: > > > > If ( VacuumSharedCostBalance >= VacuumCostLimit && > > MyCostBalance > (threshold) VacuumCostLimit / workers) > > { > > VacuumSharedCostBalance -= MyCostBalance; > > Sleep (delay * MyCostBalance/VacuumSharedCostBalance) > > } > > > > Assume threshold be 0.5, what that means is, if it has done work more > > than 50% of what is expected from this worker and the overall share > > cost balance is exceeded, then we will consider this worker to sleep. > > > > What do you guys think? > > I think the idea that the more consuming I/O they sleep more longer > time seems good. There seems not to be the drawback of approach(b) > that is to unnecessarily delay vacuum if some indexes are very small > or bulk-deletions of indexes does almost nothing such as brin. But on > the other hand it's possible that workers don't sleep even if shared > cost balance already exceeds the limit because it's necessary for > sleeping that local balance exceeds the worker's limit divided by the > number of workers. 
For example, a worker is scheduled doing I/O and > exceeds the limit substantially while other 2 workers do less I/O. And > then the 2 workers are scheduled and consume I/O. The total cost > balance already exceeds the limit but the workers will not sleep as > long as the local balance is less than (limit / # of workers). > > Right, this is the reason I suggested keeping some threshold for the local balance (say 50% of (limit / # of workers)). I think we need to do some experiments to see what works best. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
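To put numbers on that trade-off (the values here are made up for illustration): with vacuum_cost_limit = 200 and 3 workers, each worker's share is about 66, and a 50% threshold puts the local bar at about 33. In the scenario above, if the heavy worker has a local balance of 160 while the two light workers sit at 20 each, the shared balance (200) has hit the limit; the light workers skip the sleep (20 < 33) and the heavy worker sleeps, which is the intended behaviour. The residual risk is the window where the shared balance has been exceeded but no worker has yet crossed its local bar, which is what the experiments below try to quantify.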
On Fri, Nov 8, 2019 at 8:37 AM Amit Kapila <amit.kapila16@gmail.com> wrote: >> On Fri, Nov 8, 2019 at 8:18 AM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > On Wed, 6 Nov 2019 at 15:45, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Tue, Nov 5, 2019 at 11:28 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Mon, Nov 4, 2019 at 11:42 PM Andres Freund <andres@anarazel.de> wrote: > > > > > > > > > > > > > > > > The two approaches to solve this problem being discussed in that > > > > > > thread [1] are as follows: > > > > > > (a) Allow the parallel workers and master backend to have a shared > > > > > > view of vacuum cost related parameters (mainly VacuumCostBalance) and > > > > > > allow each worker to update it and then based on that decide whether > > > > > > it needs to sleep. Sawada-San has done the POC for this approach. > > > > > > See v32-0004-PoC-shared-vacuum-cost-balance in email [2]. One > > > > > > drawback of this approach could be that we allow the worker to sleep > > > > > > even though the I/O has been performed by some other worker. > > > > > > > > > > I don't understand this drawback. > > > > > > > > > > > > > I think the problem could be that the system is not properly throttled > > > > when it is supposed to be. Let me try by a simple example, say we > > > > have two workers w-1 and w-2. The w-2 is primarily doing the I/O and > > > > w-1 is doing very less I/O but unfortunately whenever w-1 checks it > > > > finds that cost_limit has exceeded and it goes for sleep, but w-1 > > > > still continues. > > > > > > > > > > Typo in the above sentence. /but w-1 still continues/but w-2 still continues. > > > > > > > Now in such a situation even though we have made one > > > > of the workers slept for a required time but ideally the worker which > > > > was doing I/O should have slept. The aim is to make the system stop > > > > doing I/O whenever the limit has exceeded, so that might not work in > > > > the above situation. > > > > > > > > > > One idea to fix this drawback is that if we somehow avoid letting the > > > workers sleep which has done less or no I/O as compared to other > > > workers, then we can to a good extent ensure that workers which are > > > doing more I/O will be throttled more. What we can do is to allow any > > > worker sleep only if it has performed the I/O above a certain > > > threshold and the overall balance is more than the cost_limit set by > > > the system. Then we will allow the worker to sleep proportional to > > > the work done by it and reduce the VacuumSharedCostBalance by the > > > amount which is consumed by the current worker. Something like: > > > > > > If ( VacuumSharedCostBalance >= VacuumCostLimit && > > > MyCostBalance > (threshold) VacuumCostLimit / workers) > > > { > > > VacuumSharedCostBalance -= MyCostBalance; > > > Sleep (delay * MyCostBalance/VacuumSharedCostBalance) > > > } > > > > > > Assume threshold be 0.5, what that means is, if it has done work more > > > than 50% of what is expected from this worker and the overall share > > > cost balance is exceeded, then we will consider this worker to sleep. > > > > > > What do you guys think? > > > > I think the idea that the more consuming I/O they sleep more longer > > time seems good. There seems not to be the drawback of approach(b) > > that is to unnecessarily delay vacuum if some indexes are very small > > or bulk-deletions of indexes does almost nothing such as brin. 
But on > > the other hand it's possible that workers don't sleep even if shared > > cost balance already exceeds the limit because it's necessary for > > sleeping that local balance exceeds the worker's limit divided by the > > number of workers. For example, a worker is scheduled doing I/O and > > exceeds the limit substantially while other 2 workers do less I/O. And > > then the 2 workers are scheduled and consume I/O. The total cost > > balance already exceeds the limit but the workers will not sleep as > > long as the local balance is less than (limit / # of workers). > > > > Right, this is the reason I told to keep some threshold for local > balance(say 50% of (limit / # of workers)). I think we need to do > some experiments to see what is the best thing to do. >

I have done some experiments along this line. I have first produced a case where we can show the problem with the existing shared costing patch (a worker which is doing less I/O might pay the penalty on behalf of the worker which is doing more I/O). I have also hacked the shared costing patch of Sawada-san so that a worker only goes to sleep if the shared balance has crossed the limit and its local balance has crossed some threshold[1].

Test setup: I have created 4 indexes on the table. Out of these, 3 indexes have a lot of pages to process but need to dirty only a few pages, whereas the 4th index has to process very few pages but needs to dirty all of them. I have attached the test script along with the mail. For each worker I have shown the delay time, the total I/O[1], and the page hit, page miss and page dirty counts.
[1] total I/O = _nhit * VacuumCostPageHit + _nmiss * VacuumCostPageMiss + _ndirty * VacuumCostPageDirty

patch 1: Shared costing patch: (delay condition -> VacuumSharedCostBalance > VacuumCostLimit)
worker 0 delay=80.00 total I/O=17931 hit=17891 miss=0 dirty=2
worker 1 delay=40.00 total I/O=17931 hit=17891 miss=0 dirty=2
worker 2 delay=110.00 total I/O=17931 hit=17891 miss=0 dirty=2
worker 3 delay=120.98 total I/O=16378 hit=4318 miss=0 dirty=603

Observation1: I think it is clearly visible here that worker 3 is doing the least total I/O but delaying for the maximum amount of time. OTOH, worker 1 is delaying for very little time compared to how much I/O it is doing. To solve this problem, I have added a small tweak to the patch, wherein a worker will only sleep if its local balance has crossed some threshold. And we can see that with that change the problem is solved to quite an extent.

patch 2: Shared costing patch: (delay condition -> VacuumSharedCostBalance > VacuumCostLimit && VacuumLocalBalance > VacuumCostLimit/number of workers)
worker 0 delay=100.12 total I/O=17931 hit=17891 miss=0 dirty=2
worker 1 delay=90.00 total I/O=17931 hit=17891 miss=0 dirty=2
worker 2 delay=80.06 total I/O=17931 hit=17891 miss=0 dirty=2
worker 3 delay=80.72 total I/O=16378 hit=4318 miss=0 dirty=603

Observation2: This patch solves the problem discussed with patch1, but in some extreme cases there is a possibility that the shared balance can become twice as much as the limit and still no worker goes for the delay. To solve that, there could be multiple ideas: a) Set a max limit on the shared balance, e.g. 1.5 * VacuumCostLimit, after which we apply the delay to whoever tries to do the I/O, irrespective of its local balance.
b) Set a somewhat lower value for the local threshold, e.g. 50% of the local limit.

Here I have changed patch2 as per (b): if the local balance reaches 50% of the local limit and the shared balance hits the vacuum cost limit, then go for the delay.

patch 3: Shared costing patch: (delay condition -> VacuumSharedCostBalance > VacuumCostLimit && VacuumLocalBalance > 0.5 * VacuumCostLimit/number of workers)
worker 0 delay=70.03 total I/O=17931 hit=17891 miss=0 dirty=2
worker 1 delay=100.14 total I/O=17931 hit=17891 miss=0 dirty=2
worker 2 delay=80.01 total I/O=17931 hit=17891 miss=0 dirty=2
worker 3 delay=101.03 total I/O=16378 hit=4318 miss=0 dirty=603

Observation3: I think patch3 doesn't completely solve the issue discussed with patch1, but it is far better than patch1. However, patch2 might have the other problem discussed in observation2. I think I need to do some more analysis and experiments before we can reach a conclusion, but one point is clear: we need to do something to solve the problem observed with patch1 if we are going with the shared costing approach. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
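For quick reference, the three delay conditions being compared above reduce to the following C-style checks (a restatement of the experiment descriptions, not code taken from any of the patches; nworkers and the sleep itself are elided):

    /* patch 1: throttle on the shared balance alone */
    if (VacuumSharedCostBalance >= VacuumCostLimit)
        sleep();

    /* patch 2: additionally require a full per-worker share of local I/O */
    if (VacuumSharedCostBalance >= VacuumCostLimit &&
        VacuumLocalBalance > VacuumCostLimit / nworkers)
        sleep();

    /* patch 3: relax the local requirement to half a per-worker share */
    if (VacuumSharedCostBalance >= VacuumCostLimit &&
        VacuumLocalBalance > 0.5 * VacuumCostLimit / nworkers)
        sleep();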
On Fri, Nov 8, 2019 at 9:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > I have done some experiments on this line. I have first produced a > case where we can show the problem with the existing shared costing > patch (worker which is doing less I/O might pay the penalty on behalf > of the worker who is doing more I/O). I have also hacked the shared > costing patch of Swada-san so that worker only go for sleep if the > shared balance has crossed the limit and it's local balance has > crossed some threadshold[1]. > > Test setup: I have created 4 indexes on the table. Out of which 3 > indexes will have a lot of pages to process but need to dirty a few > pages whereas the 4th index will have to process a very less number of > pages but need to dirty all of them. I have attached the test script > along with the mail. I have shown what is the delay time each worker > have done. What is total I/O[1] each worker and what is the page hit, > page miss and page dirty count? > [1] total I/O = _nhit * VacuumCostPageHit + _nmiss * > VacuumCostPageMiss + _ndirty * VacuumCostPageDirty > > patch 1: Shared costing patch: (delay condition -> > VacuumSharedCostBalance > VacuumCostLimit) > worker 0 delay=80.00 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 1 delay=40.00 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 2 delay=110.00 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 3 delay=120.98 total I/O=16378 hit=4318 miss=0 dirty=603 > > Observation1: I think here it's clearly visible that worker 3 is > doing the least total I/O but delaying for maximum amount of time. > OTOH, worker 1 is delaying for very little time compared to how much > I/O it is doing. So for solving this problem, I have add a small > tweak to the patch. Wherein the worker will only sleep if its local > balance has crossed some threshold. And, we can see that with that > change the problem is solved up to quite an extent. > > patch 2: Shared costing patch: (delay condition -> > VacuumSharedCostBalance > VacuumCostLimit && VacuumLocalBalance > > VacuumCostLimit/number of workers) > worker 0 delay=100.12 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 1 delay=90.00 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 2 delay=80.06 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 3 delay=80.72 total I/O=16378 hit=4318 miss=0 dirty=603 > > Observation2: This patch solves the problem discussed with patch1 but > in some extreme cases there is a possibility that the shared limit can > become twice as much as local limit and still no worker goes for the > delay. For solving that there could be multiple ideas a) Set the max > limit on shared balance e.g. 1.5 * VacuumCostLimit after that we will > give the delay whoever tries to do the I/O irrespective of its local > balance. > b) Set a little lower value for the local threshold e.g 50% of the local limit > > Here I have changed the patch2 as per (b) If local balance reaches to > 50% of the local limit and shared balance hit the vacuum cost limit > then go for the delay. 
> > patch 3: Shared costing patch: (delay condition -> > VacuumSharedCostBalance > VacuumCostLimit && VacuumLocalBalance > 0.5 > * VacuumCostLimit/number of workers) > worker 0 delay=70.03 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 1 delay=100.14 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 2 delay=80.01 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 3 delay=101.03 total I/O=16378 hit=4318 miss=0 dirty=603 > > Observation3: I think patch3 doesn't completely solve the issue > discussed in patch1 but its far better than patch1. > Yeah, I think it is difficult to get the exact balance, but we can try to be as close as possible. We can try to play with the threshold, and another possibility is to sleep in proportion to the amount of I/O done by the worker. Thanks for doing these experiments, but I think it is better if you can share the modified patches so that others can also reproduce what you are seeing. There is no need to post the entire parallel vacuum patch-set; the costing-related patch can be posted with a reference to which patches from the parallel vacuum thread it requires. Another option is to move this discussion to the parallel vacuum thread, but I think it is better to decide the costing model here. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Fri, Nov 8, 2019 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Nov 8, 2019 at 9:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > I have done some experiments on this line. I have first produced a > > case where we can show the problem with the existing shared costing > > patch (worker which is doing less I/O might pay the penalty on behalf > > of the worker who is doing more I/O). I have also hacked the shared > > costing patch of Swada-san so that worker only go for sleep if the > > shared balance has crossed the limit and it's local balance has > > crossed some threadshold[1]. > > > > Test setup: I have created 4 indexes on the table. Out of which 3 > > indexes will have a lot of pages to process but need to dirty a few > > pages whereas the 4th index will have to process a very less number of > > pages but need to dirty all of them. I have attached the test script > > along with the mail. I have shown what is the delay time each worker > > have done. What is total I/O[1] each worker and what is the page hit, > > page miss and page dirty count? > > [1] total I/O = _nhit * VacuumCostPageHit + _nmiss * > > VacuumCostPageMiss + _ndirty * VacuumCostPageDirty > > > > patch 1: Shared costing patch: (delay condition -> > > VacuumSharedCostBalance > VacuumCostLimit) > > worker 0 delay=80.00 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 1 delay=40.00 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 2 delay=110.00 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 3 delay=120.98 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > Observation1: I think here it's clearly visible that worker 3 is > > doing the least total I/O but delaying for maximum amount of time. > > OTOH, worker 1 is delaying for very little time compared to how much > > I/O it is doing. So for solving this problem, I have add a small > > tweak to the patch. Wherein the worker will only sleep if its local > > balance has crossed some threshold. And, we can see that with that > > change the problem is solved up to quite an extent. > > > > patch 2: Shared costing patch: (delay condition -> > > VacuumSharedCostBalance > VacuumCostLimit && VacuumLocalBalance > > > VacuumCostLimit/number of workers) > > worker 0 delay=100.12 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 1 delay=90.00 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 2 delay=80.06 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 3 delay=80.72 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > Observation2: This patch solves the problem discussed with patch1 but > > in some extreme cases there is a possibility that the shared limit can > > become twice as much as local limit and still no worker goes for the > > delay. For solving that there could be multiple ideas a) Set the max > > limit on shared balance e.g. 1.5 * VacuumCostLimit after that we will > > give the delay whoever tries to do the I/O irrespective of its local > > balance. > > b) Set a little lower value for the local threshold e.g 50% of the local limit > > > > Here I have changed the patch2 as per (b) If local balance reaches to > > 50% of the local limit and shared balance hit the vacuum cost limit > > then go for the delay. 
> > > patch 3: Shared costing patch: (delay condition -> > > VacuumSharedCostBalance > VacuumCostLimit && VacuumLocalBalance > 0.5 > > * VacuumCostLimit/number of workers) > > worker 0 delay=70.03 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 1 delay=100.14 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 2 delay=80.01 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 3 delay=101.03 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > Observation3: I think patch3 doesn't completely solve the issue > > discussed in patch1 but its far better than patch1. > > > > Yeah, I think it is difficult to get the exact balance, but we can try > to be as close as possible. We can try to play with the threshold and > another possibility is to try to sleep in proportion to the amount of > I/O done by the worker.

I have done another experiment where I made 2 more changes on top of patch3:
a) Only reduce the local balance from the total shared balance whenever the worker is applying a delay.
b) Compute the delay based on the local balance.

patch4:
worker 0 delay=84.130000 total I/O=17931 hit=17891 miss=0 dirty=2
worker 1 delay=89.230000 total I/O=17931 hit=17891 miss=0 dirty=2
worker 2 delay=88.680000 total I/O=17931 hit=17891 miss=0 dirty=2
worker 3 delay=80.790000 total I/O=16378 hit=4318 miss=0 dirty=603

I think with this approach the delay is divided among the workers quite well compared to the other approaches.

> > Thanks for doing these experiments, but I think it is better if you > can share the modified patches so that others can also reproduce what > you are seeing. There is no need to post the entire parallel vacuum > patch-set, but the costing related patch can be posted with a > reference to what all patches are required from parallel vacuum > thread. Another option is to move this discussion to the parallel > vacuum thread, but I think it is better to decide the costing model > here.

I have attached the POC patches I have for testing. Steps for testing:
1. First, apply the parallel vacuum base patch and the shared costing patch [1].
2. Apply 0001-vacuum_costing_test.patch, attached to this mail.
3. Run the script shared in the previous mail [2] -- this will give the results for patch 1 shared upthread [2].
4. Apply patch shared_costing_plus_patch [2] or [3] or [4] to see the results with the different approaches explained in the mail.

[1] https://www.postgresql.org/message-id/CAD21AoAqT17QwKJ_sWOqRxNvg66wMw1oZZzf9Rt-E-zD%2BXOh_Q%40mail.gmail.com [2] https://www.postgresql.org/message-id/CAFiTN-tFLN%3Dvdu5Ra-23E9_7Z1JXkk5MkRY3Bkj2zAoWK7fULA%40mail.gmail.com -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
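A minimal sketch of what the patch-4 behaviour above amounts to, assuming the same illustrative names as before (VacuumCostBalanceLocal for the per-worker balance, an atomic shared balance), with the 0.5 threshold from patch3; this follows changes (a) and (b) as described, though the exact formula in the posted POC may differ:

    if (pg_atomic_read_u32(VacuumSharedCostBalance) >= VacuumCostLimit &&
        VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworkers))
    {
        double  msec;

        /* (a) subtract only this worker's own consumption ... */
        pg_atomic_sub_fetch_u32(VacuumSharedCostBalance, VacuumCostBalanceLocal);

        /* (b) ... and sleep in proportion to that local balance */
        msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;
        pg_usleep((long) (msec * 1000));

        VacuumCostBalanceLocal = 0;
    }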
On Mon, Nov 11, 2019 at 9:43 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Fri, Nov 8, 2019 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Nov 8, 2019 at 9:39 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > I have done some experiments on this line. I have first produced a > > > case where we can show the problem with the existing shared costing > > > patch (worker which is doing less I/O might pay the penalty on behalf > > > of the worker who is doing more I/O). I have also hacked the shared > > > costing patch of Swada-san so that worker only go for sleep if the > > > shared balance has crossed the limit and it's local balance has > > > crossed some threadshold[1]. > > > > > > Test setup: I have created 4 indexes on the table. Out of which 3 > > > indexes will have a lot of pages to process but need to dirty a few > > > pages whereas the 4th index will have to process a very less number of > > > pages but need to dirty all of them. I have attached the test script > > > along with the mail. I have shown what is the delay time each worker > > > have done. What is total I/O[1] each worker and what is the page hit, > > > page miss and page dirty count? > > > [1] total I/O = _nhit * VacuumCostPageHit + _nmiss * > > > VacuumCostPageMiss + _ndirty * VacuumCostPageDirty > > > > > > patch 1: Shared costing patch: (delay condition -> > > > VacuumSharedCostBalance > VacuumCostLimit) > > > worker 0 delay=80.00 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 1 delay=40.00 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 2 delay=110.00 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 3 delay=120.98 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > > > Observation1: I think here it's clearly visible that worker 3 is > > > doing the least total I/O but delaying for maximum amount of time. > > > OTOH, worker 1 is delaying for very little time compared to how much > > > I/O it is doing. So for solving this problem, I have add a small > > > tweak to the patch. Wherein the worker will only sleep if its local > > > balance has crossed some threshold. And, we can see that with that > > > change the problem is solved up to quite an extent. > > > > > > patch 2: Shared costing patch: (delay condition -> > > > VacuumSharedCostBalance > VacuumCostLimit && VacuumLocalBalance > > > > VacuumCostLimit/number of workers) > > > worker 0 delay=100.12 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 1 delay=90.00 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 2 delay=80.06 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 3 delay=80.72 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > > > Observation2: This patch solves the problem discussed with patch1 but > > > in some extreme cases there is a possibility that the shared limit can > > > become twice as much as local limit and still no worker goes for the > > > delay. For solving that there could be multiple ideas a) Set the max > > > limit on shared balance e.g. 1.5 * VacuumCostLimit after that we will > > > give the delay whoever tries to do the I/O irrespective of its local > > > balance. > > > b) Set a little lower value for the local threshold e.g 50% of the local limit > > > > > > Here I have changed the patch2 as per (b) If local balance reaches to > > > 50% of the local limit and shared balance hit the vacuum cost limit > > > then go for the delay. 
> > > > patch 3: Shared costing patch: (delay condition -> > > > VacuumSharedCostBalance > VacuumCostLimit && VacuumLocalBalance > 0.5 > > > * VacuumCostLimit/number of workers) > > > worker 0 delay=70.03 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 1 delay=100.14 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 2 delay=80.01 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 3 delay=101.03 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > > > Observation3: I think patch3 doesn't completely solve the issue > > > discussed in patch1 but its far better than patch1. > > > > > > > Yeah, I think it is difficult to get the exact balance, but we can try > > to be as close as possible. We can try to play with the threshold and > > another possibility is to try to sleep in proportion to the amount of > > I/O done by the worker. > I have done another experiment where I have done another 2 changes on > top op patch3 > a) Only reduce the local balance from the total shared balance > whenever it's applying delay > b) Compute the delay based on the local balance. > > patch4: > worker 0 delay=84.130000 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 1 delay=89.230000 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 2 delay=88.680000 total I/O=17931 hit=17891 miss=0 dirty=2 > worker 3 delay=80.790000 total I/O=16378 hit=4318 miss=0 dirty=603 > > I think with this approach the delay is divided among the worker quite > well compared to other approaches > > > > Thanks for doing these experiments, but I think it is better if you > can share the modified patches so that others can also reproduce what > you are seeing. There is no need to post the entire parallel vacuum > patch-set, but the costing related patch can be posted with a > reference to what all patches are required from parallel vacuum > thread. Another option is to move this discussion to the parallel > vacuum thread, but I think it is better to decide the costing model > here. > > I have attached the POC patches I have for testing. Step for testing > 1. First, apply the parallel vacuum base patch and the shared costing patch[1]. > 2. Apply 0001-vacuum_costing_test.patch attached in the mail > 3. Run the script shared in previous mail [2]. --> this will give the > results for patch 1 shared upthread[2] > 4. Apply patch shared_costing_plus_patch[2] or [3] or [4] to see the > results with different approaches explained in the mail. > > [1] https://www.postgresql.org/message-id/CAD21AoAqT17QwKJ_sWOqRxNvg66wMw1oZZzf9Rt-E-zD%2BXOh_Q%40mail.gmail.com > [2] https://www.postgresql.org/message-id/CAFiTN-tFLN%3Dvdu5Ra-23E9_7Z1JXkk5MkRY3Bkj2zAoWK7fULA%40mail.gmail.com > I have tested the same with some other workload (test file attached). I can see the same behaviour with this workload as well: with patch4 the distribution of the delay is better compared to the other patches, i.e. workers with more I/O have more delay and workers with equal I/O have almost equal delay. The only thing is that the total delay with patch4 is slightly less compared to the other patches.
patch1:
worker 0 delay=120.000000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=170.000000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=210.000000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=263.400000 total io=44322 hit=8352 miss=1199 dirty=1199

patch2:
worker 0 delay=190.645000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=160.090000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=170.775000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=243.180000 total io=44322 hit=8352 miss=1199 dirty=1199

patch3:
worker 0 delay=191.765000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=180.935000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=201.305000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=192.770000 total io=44322 hit=8352 miss=1199 dirty=1199

patch4:
worker 0 delay=175.290000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=174.135000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=175.560000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=212.100000 total io=44322 hit=8352 miss=1199 dirty=1199

-- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Mon, Nov 11, 2019 at 12:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Nov 11, 2019 at 9:43 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Fri, Nov 8, 2019 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > Yeah, I think it is difficult to get the exact balance, but we can try > > > to be as close as possible. We can try to play with the threshold and > > > another possibility is to try to sleep in proportion to the amount of > > > I/O done by the worker. > > I have done another experiment where I have done another 2 changes on > > top op patch3 > > a) Only reduce the local balance from the total shared balance > > whenever it's applying delay > > b) Compute the delay based on the local balance. > > > > patch4: > > worker 0 delay=84.130000 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 1 delay=89.230000 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 2 delay=88.680000 total I/O=17931 hit=17891 miss=0 dirty=2 > > worker 3 delay=80.790000 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > I think with this approach the delay is divided among the worker quite > > well compared to other approaches > > > > > .. > I have tested the same with some other workload(test file attached). > I can see the same behaviour with this workload as well that with the > patch 4 the distribution of the delay is better compared to other > patches i.e. worker with more I/O have more delay and with equal IO > have alsomost equal delay. Only thing is that the total delay with > the patch 4 is slightly less compared to other pacthes. >

I see one problem with the formula you have used in the patch; maybe that is what causes the total delay to go down.

- if (new_balance >= VacuumCostLimit)
+ VacuumCostBalanceLocal += VacuumCostBalance;
+ if ((new_balance >= VacuumCostLimit) &&
+     (VacuumCostBalanceLocal > VacuumCostLimit/(0.5 * nworker)))

As per the discussion, the second part of the condition should be "VacuumCostBalanceLocal > (0.5) * VacuumCostLimit/nworker". I think you can change this and try again. Also, please try with different values of the threshold (0.3, 0.5, 0.7, etc.). -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
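To spell out the difference with example numbers (nworker = 4 and VacuumCostLimit = 200, chosen for illustration): dividing by (0.5 * nworker) makes the local bar VacuumCostLimit / 2 = 100, while the intended expression makes it 0.5 * (200 / 4) = 25, so the condition as written triggers the delay far less often. Side by side:

    /* As written in the patch: local bar = VacuumCostLimit / (0.5 * nworker) = 100 */
    if ((new_balance >= VacuumCostLimit) &&
        (VacuumCostBalanceLocal > VacuumCostLimit / (0.5 * nworker)))

    /* As intended: local bar = 0.5 * (VacuumCostLimit / nworker) = 25 */
    if ((new_balance >= VacuumCostLimit) &&
        (VacuumCostBalanceLocal > 0.5 * (VacuumCostLimit / nworker)))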
On Mon, Nov 11, 2019 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Nov 11, 2019 at 12:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, Nov 11, 2019 at 9:43 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Fri, Nov 8, 2019 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > Yeah, I think it is difficult to get the exact balance, but we can try > > > > to be as close as possible. We can try to play with the threshold and > > > > another possibility is to try to sleep in proportion to the amount of > > > > I/O done by the worker. > > > I have done another experiment where I have done another 2 changes on > > > top op patch3 > > > a) Only reduce the local balance from the total shared balance > > > whenever it's applying delay > > > b) Compute the delay based on the local balance. > > > > > > patch4: > > > worker 0 delay=84.130000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 1 delay=89.230000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 2 delay=88.680000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 3 delay=80.790000 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > > > I think with this approach the delay is divided among the worker quite > > > well compared to other approaches > > > > > > > > .. > > I have tested the same with some other workload(test file attached). > > I can see the same behaviour with this workload as well that with the > > patch 4 the distribution of the delay is better compared to other > > patches i.e. worker with more I/O have more delay and with equal IO > > have alsomost equal delay. Only thing is that the total delay with > > the patch 4 is slightly less compared to other pacthes. > > > > I see one problem with the formula you have used in the patch, maybe > that is causing the value of total delay to go down. > > - if (new_balance >= VacuumCostLimit) > + VacuumCostBalanceLocal += VacuumCostBalance; > + if ((new_balance >= VacuumCostLimit) && > + (VacuumCostBalanceLocal > VacuumCostLimit/(0.5 * nworker))) > > As per discussion, the second part of the condition should be > "VacuumCostBalanceLocal > (0.5) * VacuumCostLimit/nworker". My bad. I think > you can once change this and try again. Also, please try with the > different values of threshold (0.3, 0.5, 0.7, etc.). > Okay, I will retest with both patch3 and patch4 for both scenarios. I will also try different multipliers. > -- > With Regards, > Amit Kapila. > EnterpriseDB: http://www.enterprisedb.com -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Mon, Nov 11, 2019 at 5:14 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Nov 11, 2019 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > .. > > > I have tested the same with some other workload(test file attached). > > > I can see the same behaviour with this workload as well that with the > > > patch 4 the distribution of the delay is better compared to other > > > patches i.e. worker with more I/O have more delay and with equal IO > > > have alsomost equal delay. Only thing is that the total delay with > > > the patch 4 is slightly less compared to other pacthes. > > > > > > > I see one problem with the formula you have used in the patch, maybe > > that is causing the value of total delay to go down. > > > > - if (new_balance >= VacuumCostLimit) > > + VacuumCostBalanceLocal += VacuumCostBalance; > > + if ((new_balance >= VacuumCostLimit) && > > + (VacuumCostBalanceLocal > VacuumCostLimit/(0.5 * nworker))) > > > > As per discussion, the second part of the condition should be > > "VacuumCostBalanceLocal > (0.5) * VacuumCostLimit/nworker". > My Bad > I think > > you can once change this and try again. Also, please try with the > > different values of threshold (0.3, 0.5, 0.7, etc.). > > > Okay, I will retest with both patch3 and path4 for both the scenarios. > I will also try with different multipliers. > One more thing: I think we should also test these cases with a varying number of indexes (say 2, 6, 8, etc.), and then probably we should test with a varying number of workers, where the number of workers is less than the number of indexes. You can do these after finishing your previous experiments. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Mon, Nov 11, 2019 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Nov 11, 2019 at 12:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, Nov 11, 2019 at 9:43 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Fri, Nov 8, 2019 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > Yeah, I think it is difficult to get the exact balance, but we can try > > > > to be as close as possible. We can try to play with the threshold and > > > > another possibility is to try to sleep in proportion to the amount of > > > > I/O done by the worker. > > > I have done another experiment where I have done another 2 changes on > > > top op patch3 > > > a) Only reduce the local balance from the total shared balance > > > whenever it's applying delay > > > b) Compute the delay based on the local balance. > > > > > > patch4: > > > worker 0 delay=84.130000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 1 delay=89.230000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 2 delay=88.680000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > worker 3 delay=80.790000 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > > > I think with this approach the delay is divided among the worker quite > > > well compared to other approaches > > > > > > > > .. > > I have tested the same with some other workload(test file attached). > > I can see the same behaviour with this workload as well that with the > > patch 4 the distribution of the delay is better compared to other > > patches i.e. worker with more I/O have more delay and with equal IO > > have alsomost equal delay. Only thing is that the total delay with > > the patch 4 is slightly less compared to other pacthes. > > > > I see one problem with the formula you have used in the patch, maybe > that is causing the value of total delay to go down. > > - if (new_balance >= VacuumCostLimit) > + VacuumCostBalanceLocal += VacuumCostBalance; > + if ((new_balance >= VacuumCostLimit) && > + (VacuumCostBalanceLocal > VacuumCostLimit/(0.5 * nworker))) > > As per discussion, the second part of the condition should be > "VacuumCostBalanceLocal > (0.5) * VacuumCostLimit/nworker". I think > you can once change this and try again. Also, please try with the > different values of threshold (0.3, 0.5, 0.7, etc.). > I have modified patch4 and ran it with different values, but I don't see much difference in the results. In fact, I removed the condition for the local balancing check completely and the delays are still the same. I think this is because with patch4 workers only reduce their own balance and also delay in proportion to their local balance, so the second condition may not have much impact.
Patch4 (test.sh)

threshold 0:
worker 0 delay=82.380000 total io=17931 hit=17891 miss=0 dirty=2
worker 1 delay=89.370000 total io=17931 hit=17891 miss=0 dirty=2
worker 2 delay=89.645000 total io=17931 hit=17891 miss=0 dirty=2
worker 3 delay=79.150000 total io=16378 hit=4318 miss=0 dirty=603

threshold 0.1:
worker 0 delay=89.295000 total io=17931 hit=17891 miss=0 dirty=2
worker 1 delay=89.230000 total io=17931 hit=17891 miss=0 dirty=2
worker 2 delay=89.675000 total io=17931 hit=17891 miss=0 dirty=2
worker 3 delay=81.840000 total io=16378 hit=4318 miss=0 dirty=603

threshold 0.3:
worker 0 delay=85.915000 total io=17931 hit=17891 miss=0 dirty=2
worker 1 delay=85.180000 total io=17931 hit=17891 miss=0 dirty=2
worker 2 delay=88.760000 total io=17931 hit=17891 miss=0 dirty=2
worker 3 delay=81.975000 total io=16378 hit=4318 miss=0 dirty=603

threshold 0.5:
worker 0 delay=81.635000 total io=17931 hit=17891 miss=0 dirty=2
worker 1 delay=87.490000 total io=17931 hit=17891 miss=0 dirty=2
worker 2 delay=89.425000 total io=17931 hit=17891 miss=0 dirty=2
worker 3 delay=82.050000 total io=16378 hit=4318 miss=0 dirty=603

threshold 0.7:
worker 0 delay=85.185000 total io=17931 hit=17891 miss=0 dirty=2
worker 1 delay=88.835000 total io=17931 hit=17891 miss=0 dirty=2
worker 2 delay=86.005000 total io=17931 hit=17891 miss=0 dirty=2
worker 3 delay=76.160000 total io=16378 hit=4318 miss=0 dirty=603

Patch4 (test1.sh)

threshold 0:
worker 0 delay=179.005000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=179.010000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=179.010000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=221.900000 total io=44322 hit=8352 miss=1199 dirty=1199

threshold 0.1:
worker 0 delay=177.840000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=179.465000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=179.255000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=222.695000 total io=44322 hit=8352 miss=1199 dirty=1199

threshold 0.3:
worker 0 delay=178.295000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=178.720000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=178.270000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=220.420000 total io=44322 hit=8352 miss=1199 dirty=1199

threshold 0.5:
worker 0 delay=178.415000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=178.385000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=173.805000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=221.605000 total io=44322 hit=8352 miss=1199 dirty=1199

threshold 0.7:
worker 0 delay=175.330000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=177.890000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=167.540000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=216.725000 total io=44322 hit=8352 miss=1199 dirty=1199

-- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Tue, Nov 12, 2019 at 10:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Mon, Nov 11, 2019 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Nov 11, 2019 at 12:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Mon, Nov 11, 2019 at 9:43 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Fri, Nov 8, 2019 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > > Yeah, I think it is difficult to get the exact balance, but we can try > > > > > to be as close as possible. We can try to play with the threshold and > > > > > another possibility is to try to sleep in proportion to the amount of > > > > > I/O done by the worker. > > > > I have done another experiment where I have done another 2 changes on > > > > top op patch3 > > > > a) Only reduce the local balance from the total shared balance > > > > whenever it's applying delay > > > > b) Compute the delay based on the local balance. > > > > > > > > patch4: > > > > worker 0 delay=84.130000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > worker 1 delay=89.230000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > worker 2 delay=88.680000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > worker 3 delay=80.790000 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > > > > > I think with this approach the delay is divided among the worker quite > > > > well compared to other approaches > > > > > > > > > > > .. > > > I have tested the same with some other workload(test file attached). > > > I can see the same behaviour with this workload as well that with the > > > patch 4 the distribution of the delay is better compared to other > > > patches i.e. worker with more I/O have more delay and with equal IO > > > have alsomost equal delay. Only thing is that the total delay with > > > the patch 4 is slightly less compared to other pacthes. > > > > > > > I see one problem with the formula you have used in the patch, maybe > > that is causing the value of total delay to go down. > > > > - if (new_balance >= VacuumCostLimit) > > + VacuumCostBalanceLocal += VacuumCostBalance; > > + if ((new_balance >= VacuumCostLimit) && > > + (VacuumCostBalanceLocal > VacuumCostLimit/(0.5 * nworker))) > > > > As per discussion, the second part of the condition should be > > "VacuumCostBalanceLocal > (0.5) * VacuumCostLimit/nworker". I think > > you can once change this and try again. Also, please try with the > > different values of threshold (0.3, 0.5, 0.7, etc.). > > > I have modified the patch4 and ran with different values. But, I > don't see much difference in the values with the patch4. Infact I > removed the condition for the local balancing check completely still > the delays are the same, I think this is because with patch4 worker > are only reducing their own balance and also delaying as much as their > local balance. So maybe the second condition will not have much > impact. 
> > Patch4 (test.sh) > 0 > worker 0 delay=82.380000 total io=17931 hit=17891 miss=0 dirty=2 > worker 1 delay=89.370000 total io=17931 hit=17891 miss=0 dirty=2 > worker 2 delay=89.645000 total io=17931 hit=17891 miss=0 dirty=2 > worker 3 delay=79.150000 total io=16378 hit=4318 miss=0 dirty=603 > > 0.1 > worker 0 delay=89.295000 total io=17931 hit=17891 miss=0 dirty=2 > worker 1 delay=89.230000 total io=17931 hit=17891 miss=0 dirty=2 > worker 2 delay=89.675000 total io=17931 hit=17891 miss=0 dirty=2 > worker 3 delay=81.840000 total io=16378 hit=4318 miss=0 dirty=603 > > 0.3 > worker 0 delay=85.915000 total io=17931 hit=17891 miss=0 dirty=2 > worker 1 delay=85.180000 total io=17931 hit=17891 miss=0 dirty=2 > worker 2 delay=88.760000 total io=17931 hit=17891 miss=0 dirty=2 > worker 3 delay=81.975000 total io=16378 hit=4318 miss=0 dirty=603 > > 0.5 > worker 0 delay=81.635000 total io=17931 hit=17891 miss=0 dirty=2 > worker 1 delay=87.490000 total io=17931 hit=17891 miss=0 dirty=2 > worker 2 delay=89.425000 total io=17931 hit=17891 miss=0 dirty=2 > worker 3 delay=82.050000 total io=16378 hit=4318 miss=0 dirty=603 > > 0.7 > worker 0 delay=85.185000 total io=17931 hit=17891 miss=0 dirty=2 > worker 1 delay=88.835000 total io=17931 hit=17891 miss=0 dirty=2 > worker 2 delay=86.005000 total io=17931 hit=17891 miss=0 dirty=2 > worker 3 delay=76.160000 total io=16378 hit=4318 miss=0 dirty=603 > > Patch4 (test1.sh) > 0 > worker 0 delay=179.005000 total io=35828 hit=35788 miss=0 dirty=2 > worker 1 delay=179.010000 total io=35828 hit=35788 miss=0 dirty=2 > worker 2 delay=179.010000 total io=35828 hit=35788 miss=0 dirty=2 > worker 3 delay=221.900000 total io=44322 hit=8352 miss=1199 dirty=1199 > > 0.1 > worker 0 delay=177.840000 total io=35828 hit=35788 miss=0 dirty=2 > worker 1 delay=179.465000 total io=35828 hit=35788 miss=0 dirty=2 > worker 2 delay=179.255000 total io=35828 hit=35788 miss=0 dirty=2 > worker 3 delay=222.695000 total io=44322 hit=8352 miss=1199 dirty=1199 > > 0.3 > worker 0 delay=178.295000 total io=35828 hit=35788 miss=0 dirty=2 > worker 1 delay=178.720000 total io=35828 hit=35788 miss=0 dirty=2 > worker 2 delay=178.270000 total io=35828 hit=35788 miss=0 dirty=2 > worker 3 delay=220.420000 total io=44322 hit=8352 miss=1199 dirty=1199 > > 0.5 > worker 0 delay=178.415000 total io=35828 hit=35788 miss=0 dirty=2 > worker 1 delay=178.385000 total io=35828 hit=35788 miss=0 dirty=2 > worker 2 delay=173.805000 total io=35828 hit=35788 miss=0 dirty=2 > worker 3 delay=221.605000 total io=44322 hit=8352 miss=1199 dirty=1199 > > 0.7 > worker 0 delay=175.330000 total io=35828 hit=35788 miss=0 dirty=2 > worker 1 delay=177.890000 total io=35828 hit=35788 miss=0 dirty=2 > worker 2 delay=167.540000 total io=35828 hit=35788 miss=0 dirty=2 > worker 3 delay=216.725000 total io=44322 hit=8352 miss=1199 dirty=1199 > I have revised patch4 so that it doesn't depend upon a fixed number of workers; instead, I have dynamically updated the worker count. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
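One plausible shape for that revision, purely as a sketch (the struct and field names are hypothetical, not taken from the posted patch), is to keep the live worker count next to the shared balance and read it whenever the per-worker share is computed:

    /* Hypothetical shared state for the parallel vacuum cost machinery. */
    typedef struct ParallelVacuumShared
    {
        pg_atomic_uint32 cost_balance;      /* shared vacuum cost balance */
        pg_atomic_uint32 active_nworkers;   /* workers currently doing work */
    } ParallelVacuumShared;

    /* Worker start: announce ourselves before doing any costed I/O. */
    pg_atomic_add_fetch_u32(&shared->active_nworkers, 1);

    /* Delay check: divide by the live count, not a launch-time constant. */
    nworkers = pg_atomic_read_u32(&shared->active_nworkers);

    /* Worker exit: drop out, so the remaining workers' shares grow. */
    pg_atomic_sub_fetch_u32(&shared->active_nworkers, 1);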
On Tue, Nov 12, 2019 at 3:03 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Tue, Nov 12, 2019 at 10:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Mon, Nov 11, 2019 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Mon, Nov 11, 2019 at 12:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Mon, Nov 11, 2019 at 9:43 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > On Fri, Nov 8, 2019 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > > > > > Yeah, I think it is difficult to get the exact balance, but we can try > > > > > > to be as close as possible. We can try to play with the threshold and > > > > > > another possibility is to try to sleep in proportion to the amount of > > > > > > I/O done by the worker. > > > > > I have done another experiment where I have done another 2 changes on > > > > > top op patch3 > > > > > a) Only reduce the local balance from the total shared balance > > > > > whenever it's applying delay > > > > > b) Compute the delay based on the local balance. > > > > > > > > > > patch4: > > > > > worker 0 delay=84.130000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > > worker 1 delay=89.230000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > > worker 2 delay=88.680000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > > worker 3 delay=80.790000 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > > > > > > > I think with this approach the delay is divided among the worker quite > > > > > well compared to other approaches > > > > > > > > > > > > > > .. > > > > I have tested the same with some other workload(test file attached). > > > > I can see the same behaviour with this workload as well that with the > > > > patch 4 the distribution of the delay is better compared to other > > > > patches i.e. worker with more I/O have more delay and with equal IO > > > > have alsomost equal delay. Only thing is that the total delay with > > > > the patch 4 is slightly less compared to other pacthes. > > > > > > > > > > I see one problem with the formula you have used in the patch, maybe > > > that is causing the value of total delay to go down. > > > > > > - if (new_balance >= VacuumCostLimit) > > > + VacuumCostBalanceLocal += VacuumCostBalance; > > > + if ((new_balance >= VacuumCostLimit) && > > > + (VacuumCostBalanceLocal > VacuumCostLimit/(0.5 * nworker))) > > > > > > As per discussion, the second part of the condition should be > > > "VacuumCostBalanceLocal > (0.5) * VacuumCostLimit/nworker". I think > > > you can once change this and try again. Also, please try with the > > > different values of threshold (0.3, 0.5, 0.7, etc.). > > > > > I have modified the patch4 and ran with different values. But, I > > don't see much difference in the values with the patch4. Infact I > > removed the condition for the local balancing check completely still > > the delays are the same, I think this is because with patch4 worker > > are only reducing their own balance and also delaying as much as their > > local balance. So maybe the second condition will not have much > > impact. > > Yeah, but I suspect the condition (when the local balance exceeds a certain threshold, then only try to perform delay) you mentioned can have an impact in some other scenarios. So, it is better to retain the same. I feel the overall results look sane and the approach seems reasonable to me. 
> > > I have revised the patch4 so that it doesn't depent upon the fix > number of workers, instead I have dynamically updated the worker > count. > Thanks. Sawada-San, by any chance, can you try some of the tests done by Dilip or some similar tests just to rule out any sort of machine-specific dependency? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Tue, 12 Nov 2019 at 19:08, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Nov 12, 2019 at 3:03 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > On Tue, Nov 12, 2019 at 10:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Mon, Nov 11, 2019 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > On Mon, Nov 11, 2019 at 12:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > On Mon, Nov 11, 2019 at 9:43 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > On Fri, Nov 8, 2019 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > Yeah, I think it is difficult to get the exact balance, but we can try > > > > > > > to be as close as possible. We can try to play with the threshold and > > > > > > > another possibility is to try to sleep in proportion to the amount of > > > > > > > I/O done by the worker. > > > > > > I have done another experiment where I have done another 2 changes on > > > > > > top op patch3 > > > > > > a) Only reduce the local balance from the total shared balance > > > > > > whenever it's applying delay > > > > > > b) Compute the delay based on the local balance. > > > > > > > > > > > > patch4: > > > > > > worker 0 delay=84.130000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > > > worker 1 delay=89.230000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > > > worker 2 delay=88.680000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > > > worker 3 delay=80.790000 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > > > > > > > > > I think with this approach the delay is divided among the worker quite > > > > > > well compared to other approaches > > > > > > > > > > > > > > > > > .. > > > > > I have tested the same with some other workload(test file attached). > > > > > I can see the same behaviour with this workload as well that with the > > > > > patch 4 the distribution of the delay is better compared to other > > > > > patches i.e. worker with more I/O have more delay and with equal IO > > > > > have alsomost equal delay. Only thing is that the total delay with > > > > > the patch 4 is slightly less compared to other pacthes. > > > > > > > > > > > > > I see one problem with the formula you have used in the patch, maybe > > > > that is causing the value of total delay to go down. > > > > > > > > - if (new_balance >= VacuumCostLimit) > > > > + VacuumCostBalanceLocal += VacuumCostBalance; > > > > + if ((new_balance >= VacuumCostLimit) && > > > > + (VacuumCostBalanceLocal > VacuumCostLimit/(0.5 * nworker))) > > > > > > > > As per discussion, the second part of the condition should be > > > > "VacuumCostBalanceLocal > (0.5) * VacuumCostLimit/nworker". I think > > > > you can once change this and try again. Also, please try with the > > > > different values of threshold (0.3, 0.5, 0.7, etc.). > > > > > > > I have modified the patch4 and ran with different values. But, I > > > don't see much difference in the values with the patch4. Infact I > > > removed the condition for the local balancing check completely still > > > the delays are the same, I think this is because with patch4 worker > > > are only reducing their own balance and also delaying as much as their > > > local balance. So maybe the second condition will not have much > > > impact. > > > > > Yeah, but I suspect the condition (when the local balance exceeds a > certain threshold, then only try to perform delay) you mentioned can > have an impact in some other scenarios. 
So, it is better to retain > the same. I feel the overall results look sane and the approach seems > reasonable to me. > > > > > > I have revised the patch4 so that it doesn't depend upon a fixed > > number of workers; instead I have dynamically updated the worker > > count. > > > > Thanks. Sawada-San, by any chance, can you try some of the tests done > by Dilip or some similar tests just to rule out any sort of > machine-specific dependency? Sure. I'll try it tomorrow. -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
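To make the threshold discussion above concrete, here is a minimal compilable sketch of the corrected eligibility check (an illustration only, not the actual patch; the names follow the diff quoted above). It also shows why the posted form was wrong: VacuumCostLimit/(0.5 * nworker) equals 2 * VacuumCostLimit/nworker, i.e. four times the intended per-worker threshold of 0.5 * VacuumCostLimit/nworker, so workers became eligible to sleep much later than intended.

#include <stdbool.h>

/*
 * Sketch only: decide whether this worker may sleep.  The worker must
 * itself have accumulated a threshold share of the cost (so it is not
 * throttled for I/O done by other workers), and the shared balance
 * must have crossed the limit.
 */
static bool
worker_may_sleep(double new_balance, double VacuumCostBalanceLocal,
                 double VacuumCostLimit, int nworker)
{
    return (new_balance >= VacuumCostLimit) &&
           (VacuumCostBalanceLocal > 0.5 * VacuumCostLimit / nworker);
}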
On Tue, 12 Nov 2019 at 20:22, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > On Tue, 12 Nov 2019 at 19:08, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Tue, Nov 12, 2019 at 3:03 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Tue, Nov 12, 2019 at 10:47 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > On Mon, Nov 11, 2019 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > On Mon, Nov 11, 2019 at 12:59 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > On Mon, Nov 11, 2019 at 9:43 AM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > > > > > > > > > On Fri, Nov 8, 2019 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > Yeah, I think it is difficult to get the exact balance, but we can try > > > > > > > > to be as close as possible. We can try to play with the threshold and > > > > > > > > another possibility is to try to sleep in proportion to the amount of > > > > > > > > I/O done by the worker. > > > > > > > I have done another experiment where I have made another 2 changes on > > > > > > > top of patch3 > > > > > > > a) Only reduce the local balance from the total shared balance > > > > > > > whenever it's applying delay > > > > > > > b) Compute the delay based on the local balance. > > > > > > > > > > > > > > patch4: > > > > > > > worker 0 delay=84.130000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > > > > worker 1 delay=89.230000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > > > > worker 2 delay=88.680000 total I/O=17931 hit=17891 miss=0 dirty=2 > > > > > > > worker 3 delay=80.790000 total I/O=16378 hit=4318 miss=0 dirty=603 > > > > > > > > > > > > > > I think with this approach the delay is divided among the workers quite > > > > > > > well compared to other approaches > > > > > > > > > > > > > > > > > > > > .. > > > > > > I have tested the same with some other workload (test file attached). > > > > > > I can see the same behaviour with this workload as well: with > > > > > > patch 4 the distribution of the delay is better compared to other > > > > > > patches, i.e. workers with more I/O have more delay and those with > > > > > > equal I/O have almost equal delay. The only thing is that the total > > > > > > delay with patch 4 is slightly less compared to other patches. > > > > > > > > > > > > > > > > I see one problem with the formula you have used in the patch, maybe > > > > > that is causing the value of total delay to go down. > > > > > > > > > > - if (new_balance >= VacuumCostLimit) > > > > > + VacuumCostBalanceLocal += VacuumCostBalance; > > > > > + if ((new_balance >= VacuumCostLimit) && > > > > > + (VacuumCostBalanceLocal > VacuumCostLimit/(0.5 * nworker))) > > > > > > > > > > As per discussion, the second part of the condition should be > > > > > "VacuumCostBalanceLocal > (0.5) * VacuumCostLimit/nworker". I think > > > > > you can once change this and try again. Also, please try with the > > > > > different values of threshold (0.3, 0.5, 0.7, etc.). > > > > > > > > > I have modified the patch4 and ran it with different values. But I > > > > don't see much difference in the values with the patch4. In fact, I > > > > removed the condition for the local balancing check completely; still > > > > the delays are the same. I think this is because with patch4 workers > > > > are only reducing their own balance and also delaying as much as their > > > > local balance. So maybe the second condition will not have much > > > > impact.
> > > > > > > > Yeah, but I suspect the condition (when the local balance exceeds a > > certain threshold, then only try to perform delay) you mentioned can > > have an impact in some other scenarios. So, it is better to retain > > the same. I feel the overall results look sane and the approach seems > > reasonable to me. > > > > > > > > > I have revised the patch4 so that it doesn't depend upon a fixed > > > number of workers; instead I have dynamically updated the worker > > > count. > > > > > > > Thanks. Sawada-San, by any chance, can you try some of the tests done > > by Dilip or some similar tests just to rule out any sort of > > machine-specific dependency? > > Sure. I'll try it tomorrow. I've done some tests while changing shared buffer size, delays and number of workers. The overall results have a similar tendency to the results shared by Dilip and look reasonable to me. * test.sh shared_buffers = '4GB'; max_parallel_maintenance_workers = 6; vacuum_cost_delay = 1; worker 0 delay=89.315000 total io=17931 hit=17891 miss=0 dirty=2 worker 1 delay=88.860000 total io=17931 hit=17891 miss=0 dirty=2 worker 2 delay=89.290000 total io=17931 hit=17891 miss=0 dirty=2 worker 3 delay=81.805000 total io=16378 hit=4318 miss=0 dirty=603 shared_buffers = '1GB'; max_parallel_maintenance_workers = 6; vacuum_cost_delay = 1; worker 0 delay=89.210000 total io=17931 hit=17891 miss=0 dirty=2 worker 1 delay=89.325000 total io=17931 hit=17891 miss=0 dirty=2 worker 2 delay=88.870000 total io=17931 hit=17891 miss=0 dirty=2 worker 3 delay=81.735000 total io=16378 hit=4318 miss=0 dirty=603 shared_buffers = '512MB'; max_parallel_maintenance_workers = 6; vacuum_cost_delay = 1; worker 0 delay=88.480000 total io=17931 hit=17891 miss=0 dirty=2 worker 1 delay=88.635000 total io=17931 hit=17891 miss=0 dirty=2 worker 2 delay=88.600000 total io=17931 hit=17891 miss=0 dirty=2 worker 3 delay=81.660000 total io=16378 hit=4318 miss=0 dirty=603 shared_buffers = '512MB'; max_parallel_maintenance_workers = 6; vacuum_cost_delay = 5; worker 0 delay=447.725000 total io=17931 hit=17891 miss=0 dirty=2 worker 1 delay=445.850000 total io=17931 hit=17891 miss=0 dirty=2 worker 2 delay=445.125000 total io=17931 hit=17891 miss=0 dirty=2 worker 3 delay=409.025000 total io=16378 hit=4318 miss=0 dirty=603 shared_buffers = '512MB'; max_parallel_maintenance_workers = 2; vacuum_cost_delay = 5; worker 0 delay=854.750000 total io=34309 hit=22209 miss=0 dirty=605 worker 1 delay=446.500000 total io=17931 hit=17891 miss=0 dirty=2 worker 2 delay=444.175000 total io=17931 hit=17891 miss=0 dirty=2 --- * test1.sh shared_buffers = '4GB'; max_parallel_maintenance_workers = 6; vacuum_cost_delay = 1; worker 0 delay=178.205000 total io=35828 hit=35788 miss=0 dirty=2 worker 1 delay=178.550000 total io=35828 hit=35788 miss=0 dirty=2 worker 2 delay=178.660000 total io=35828 hit=35788 miss=0 dirty=2 worker 3 delay=221.280000 total io=44322 hit=8352 miss=1199 dirty=1199 shared_buffers = '1GB'; max_parallel_maintenance_workers = 6; vacuum_cost_delay = 1; worker 0 delay=178.035000 total io=35828 hit=35788 miss=0 dirty=2 worker 1 delay=178.535000 total io=35828 hit=35788 miss=0 dirty=2 worker 2 delay=178.585000 total io=35828 hit=35788 miss=0 dirty=2 worker 3 delay=221.465000 total io=44322 hit=8352 miss=1199 dirty=1199 shared_buffers = '512MB'; max_parallel_maintenance_workers = 6; vacuum_cost_delay = 1; worker 0 delay=1795.900000 total io=357911 hit=1 miss=35787 dirty=2 worker 1 delay=1790.700000 total io=357911 hit=1 miss=35787 dirty=2 worker 2 delay=179.000000
total io=35828 hit=35788 miss=0 dirty=2 worker 3 delay=221.355000 total io=44322 hit=8352 miss=1199 dirty=1199 shared_buffers = '512MB'; max_parallel_maintenance_workers = 6; vacuum_cost_delay = 5; worker 0 delay=8958.500000 total io=357911 hit=1 miss=35787 dirty=2 worker 1 delay=8950.000000 total io=357911 hit=1 miss=35787 dirty=2 worker 2 delay=894.150000 total io=35828 hit=35788 miss=0 dirty=2 worker 3 delay=1106.400000 total io=44322 hit=8352 miss=1199 dirty=1199 shared_buffers = '512MB'; max_parallel_maintenance_workers = 2; vacuum_cost_delay = 5; worker 0 delay=8956.500000 total io=357911 hit=1 miss=35787 dirty=2 worker 1 delay=8955.050000 total io=357893 hit=3 miss=35785 dirty=2 worker 2 delay=2002.825000 total io=80150 hit=44140 miss=1199 dirty=1201 Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
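As a quick cross-check of these numbers: for cost-based vacuum, the expected sleep time is roughly vacuum_cost_delay * total_cost / vacuum_cost_limit. Assuming the default vacuum_cost_limit of 200 (the settings above only change vacuum_cost_delay, so the default limit is an assumption), the 512MB test1.sh rows match this closely; a small sketch:

#include <stdio.h>

/* Assumes vacuum_cost_limit = 200 (the default); illustration only. */
int
main(void)
{
    double cost_delay = 1.0;    /* vacuum_cost_delay in ms */
    double cost_limit = 200.0;  /* assumed default vacuum_cost_limit */

    /* worker 0 above: total io=357911 -> ~1789.6 ms vs. 1795.9 observed */
    printf("%.1f ms\n", cost_delay * 357911 / cost_limit);
    /* worker 2 above: total io=35828 -> ~179.1 ms vs. 179.0 observed */
    printf("%.1f ms\n", cost_delay * 35828 / cost_limit);
    return 0;
}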
On Mon, 11 Nov 2019 at 17:56, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Nov 11, 2019 at 5:14 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> >
> > On Mon, Nov 11, 2019 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > ..
> > > > I have tested the same with some other workload (test file attached).
> > > > I can see the same behaviour with this workload as well: with
> > > > patch 4 the distribution of the delay is better compared to other
> > > > patches, i.e. workers with more I/O have more delay and those with
> > > > equal I/O have almost equal delay. The only thing is that the total
> > > > delay with patch 4 is slightly less compared to other patches.
> > > >
> > >
> > > I see one problem with the formula you have used in the patch, maybe
> > > that is causing the value of total delay to go down.
> > >
> > > - if (new_balance >= VacuumCostLimit)
> > > + VacuumCostBalanceLocal += VacuumCostBalance;
> > > + if ((new_balance >= VacuumCostLimit) &&
> > > + (VacuumCostBalanceLocal > VacuumCostLimit/(0.5 * nworker)))
> > >
> > > As per discussion, the second part of the condition should be
> > > "VacuumCostBalanceLocal > (0.5) * VacuumCostLimit/nworker".
> > My Bad
> > I think
> > > you can once change this and try again. Also, please try with the
> > > different values of threshold (0.3, 0.5, 0.7, etc.).
> > >
> > Okay, I will retest with both patch3 and patch4 for both scenarios.
> > I will also try with different multipliers.
> >
>
> One more thing, I think we should also test these cases with a varying
> number of indexes (say 2, 6, 8, etc.), and then probably we should test
> with a varying number of workers where the number of workers is less
> than the number of indexes. You can do these after finishing your previous
> experiments.
On top of the parallel vacuum patch, I applied Dilip's patch (0001-vacuum_costing_test.patch). I have tested by varying the number of indexes and the number of workers. I compared shared costing (0001-vacuum_costing_test.patch) vs the latest shared costing patch (shared_costing_plus_patch4_v1.patch).
With the shared costing base patch, I can see that the delay is not in sync with the I/O, which is resolved by applying shared_costing_plus_patch4_v1.patch. I have also observed that the total delay is slightly reduced with the shared_costing_plus_patch4_v1.patch patch.
Below is the full testing summary:
Test setup:
step1) Apply the parallel vacuum patch
step2) Apply the 0001-vacuum_costing_test.patch patch (on top of this patch, the delay is not in sync with the I/O)
step3) Apply shared_costing_plus_patch4_v1.patch (the delay is in sync with the I/O)
Configuration settings:
autovacuum = off
max_parallel_workers = 30
shared_buffers = 2GB
max_parallel_maintenance_workers = 20
vacuum_cost_limit = 2000
vacuum_cost_delay = 10
Test 1: Vary indexes (2, 4, 6, 8) with parallel workers fixed at 4:
Case 1) When indexes are 2:
Without shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=120.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 1 delay=60.000000 total io=17931 hit=17891 miss=0 dirty=2
With shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=87.780000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 1 delay=87.995000 total io=17931 hit=17891 miss=0 dirty=2
Case 2) When indexes are 4:
Without shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=120.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 1 delay=80.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 2 delay=60.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 3 delay=100.000000 total io=17931 hit=17891 miss=0 dirty=2
With shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=87.430000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 1 delay=87.175000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 2 delay=86.340000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 3 delay=88.020000 total io=17931 hit=17891 miss=0 dirty=2
Case 3) When indexes are 6:
Without shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=110.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 1 delay=100.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 2 delay=160.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 3 delay=90.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 4 delay=80.000000 total io=17931 hit=17891 miss=0 dirty=2
With shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=173.195000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 1 delay=88.715000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 2 delay=87.710000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 3 delay=86.460000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 4 delay=89.435000 total io=17931 hit=17891 miss=0 dirty=2
Case 4) When indexes are 8:
Without shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=170.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 1 delay=120.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 2 delay=130.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 3 delay=190.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 4 delay=110.000000 total io=35862 hit=35782 miss=0 dirty=4
With shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=174.700000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 1 delay=177.880000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 2 delay=89.460000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 3 delay=177.320000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 4 delay=86.810000 total io=17931 hit=17891 miss=0 dirty=2
Test 2: Indexes are 16 with 2, 4, and 8 parallel workers:
Case 1) When 2 parallel workers:
Without shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=1513.230000 total io=307197 hit=85167 miss=22179 dirty=12
WARNING: worker 1 delay=1543.385000 total io=326553 hit=63133 miss=26322 dirty=10
WARNING: worker 2 delay=1633.625000 total io=302199 hit=65839 miss=23616 dirty=10
With shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=1539.475000 total io=308175 hit=65175 miss=24280 dirty=10
WARNING: worker 1 delay=1251.200000 total io=250692 hit=71562 miss=17893 dirty=10
WARNING: worker 2 delay=1143.690000 total io=228987 hit=93857 miss=13489 dirty=12
Case 2) When 4 parallel workers:
Without shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=1182.430000 total io=213567 hit=16037 miss=19745 dirty=4
WARNING: worker 1 delay=1202.710000 total io=178941 hit=1 miss=17890 dirty=2
WARNING: worker 2 delay=210.000000 total io=89655 hit=89455 miss=0 dirty=10
WARNING: worker 3 delay=270.000000 total io=71724 hit=71564 miss=0 dirty=8
WARNING: worker 4 delay=851.825000 total io=188229 hit=58619 miss=12945 dirty=8
With shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=1136.875000 total io=227679 hit=14469 miss=21313 dirty=4
WARNING: worker 1 delay=973.745000 total io=196881 hit=17891 miss=17891 dirty=4
WARNING: worker 2 delay=447.410000 total io=89655 hit=89455 miss=0 dirty=10
WARNING: worker 3 delay=833.235000 total io=168228 hit=40958 miss=12715 dirty=6
WARNING: worker 4 delay=683.200000 total io=136488 hit=64368 miss=7196 dirty=8
Case 3) When 8 parallel workers:
Without shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=1022.300000 total io=178941 hit=1 miss=17890 dirty=2
WARNING: worker 1 delay=1072.770000 total io=178941 hit=1 miss=17890 dirty=2
WARNING: worker 2 delay=170.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 3 delay=170.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 4 delay=140.035000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 5 delay=200.000000 total io=53802 hit=53672 miss=1 dirty=6
WARNING: worker 6 delay=130.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 7 delay=150.000000 total io=53793 hit=53673 miss=0 dirty=6
With shared_costing_plus_patch4_v1.patch:
WARNING: worker 0 delay=872.800000 total io=178941 hit=1 miss=17890 dirty=2
WARNING: worker 1 delay=885.950000 total io=178941 hit=1 miss=17890 dirty=2
WARNING: worker 2 delay=175.680000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 3 delay=259.560000 total io=53793 hit=53673 miss=0 dirty=6
WARNING: worker 4 delay=169.945000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 5 delay=613.845000 total io=125100 hit=45750 miss=7923 dirty=6
WARNING: worker 6 delay=171.895000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 7 delay=176.505000 total io=35862 hit=35782 miss=0 dirty=4
On Wed, Nov 13, 2019 at 10:02 AM Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote: > > I've done some tests while changing shared buffer size, delays and > number of workers. The overall results have a similar tendency to the > results shared by Dilip and look reasonable to me. > Thanks, Sawada-san, for repeating the tests. I can see from yours, Dilip's, and Mahendra's testing that the delay is distributed depending upon the I/O done by a particular worker and the total I/O is also as expected in various kinds of scenarios. So, I think this is a better approach. Do you agree, or do you think we should still investigate the other approach further as well? I would like to summarize this approach. The basic idea for parallel vacuum is to allow the parallel workers and master backend to have a shared view of vacuum cost related parameters (mainly VacuumCostBalance) and allow each worker to update it and then based on that decide whether it needs to sleep. With this basic idea, we found that in some cases the throttling is not accurate, as explained with an example in my email above [1] and then in the tests performed by Dilip and others in the following emails (in short, the workers doing more I/O can be throttled less). Then, as discussed in a later email [2], we tried a way to avoid letting workers sleep which have done less or no I/O compared to other workers. This ensured that workers who are doing more I/O got throttled more. The idea is to allow any worker to sleep only if it has performed I/O above a certain threshold and the overall balance is more than the cost_limit set by the system. Then we allow the worker to sleep in proportion to the work done by it and reduce the VacuumSharedCostBalance by the amount consumed by the current worker. This scheme leads to the desired throttling of different workers based on the work done by the individual worker. We have tested this idea with various kinds of workloads, e.g. by varying shared buffer size, delays and number of workers. We have also tried with different numbers of indexes and workers. In all the tests, we found that the workers are throttled in proportion to the I/O done by each particular worker. [1] - https://www.postgresql.org/message-id/CAA4eK1JvxBTWTPqHGx1X7in7j42ZYwuKOZUySzH3YMwTNRE-2Q%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CAA4eK1K9kCqLKbVA9KUuuarjj%2BsNYqrmf6UAFok5VTgZ8evWoA%40mail.gmail.com -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
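To tie the summary above together, here is a minimal sketch of what the delay point looks like under this scheme. This is an illustration, not the committed patch: pg_usleep is PostgreSQL's existing sleep primitive, the plain static variables stand in for state that the real patch keeps in shared memory and updates atomically, and nworkers stands for the current (dynamically updated) number of running workers.

/* Illustrative sketch of the scheme summarized above; not the patch. */
static double VacuumSharedCostBalance;  /* in shared memory in the patch */
static double VacuumCostBalanceLocal;   /* this worker's unslept share */
static double VacuumCostLimit = 2000;
static double VacuumCostDelay = 10;     /* milliseconds */
static int    nworkers = 4;             /* updated dynamically in patch4 */

static void
parallel_vacuum_delay_point(double cost)
{
    VacuumSharedCostBalance += cost;    /* an atomic add in the patch */
    VacuumCostBalanceLocal += cost;

    /*
     * Sleep only when the system as a whole has crossed the limit AND
     * this worker has itself done at least a threshold share of the
     * I/O, so it is not throttled for work done by other workers.
     */
    if (VacuumSharedCostBalance >= VacuumCostLimit &&
        VacuumCostBalanceLocal > 0.5 * VacuumCostLimit / nworkers)
    {
        /* Sleep in proportion to this worker's own work ... */
        double msec = VacuumCostDelay * VacuumCostBalanceLocal / VacuumCostLimit;

        pg_usleep((long) (msec * 1000));

        /* ... and give back only what this worker consumed. */
        VacuumSharedCostBalance -= VacuumCostBalanceLocal;
        VacuumCostBalanceLocal = 0;
    }
}

Because a worker that has done little I/O rarely passes the local-balance test, it is almost never throttled for others' work, which is the behaviour the test results above demonstrate.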
On Thu, Nov 14, 2019 at 5:02 PM Mahendra Singh <mahi6run@gmail.com> wrote: > > On Mon, 11 Nov 2019 at 17:56, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Mon, Nov 11, 2019 at 5:14 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > > > > > On Mon, Nov 11, 2019 at 4:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > .. > > > > > I have tested the same with some other workload (test file attached). > > > > > I can see the same behaviour with this workload as well: with > > > > > patch 4 the distribution of the delay is better compared to other > > > > > patches, i.e. workers with more I/O have more delay and those with > > > > > equal I/O have almost equal delay. The only thing is that the total > > > > > delay with patch 4 is slightly less compared to other patches. > > > > > > > > > > > > > I see one problem with the formula you have used in the patch, maybe > > > > that is causing the value of total delay to go down. > > > > > > > > - if (new_balance >= VacuumCostLimit) > > > > + VacuumCostBalanceLocal += VacuumCostBalance; > > > > + if ((new_balance >= VacuumCostLimit) && > > > > + (VacuumCostBalanceLocal > VacuumCostLimit/(0.5 * nworker))) > > > > > > > > As per discussion, the second part of the condition should be > > > > "VacuumCostBalanceLocal > (0.5) * VacuumCostLimit/nworker". > > > My Bad > > > I think > > > > you can once change this and try again. Also, please try with the > > > > different values of threshold (0.3, 0.5, 0.7, etc.). > > > > > > > Okay, I will retest with both patch3 and patch4 for both scenarios. > > > I will also try with different multipliers. > > > > > > > One more thing, I think we should also test these cases with a varying > > number of indexes (say 2, 6, 8, etc.), and then probably we should test > > with a varying number of workers where the number of workers is less > > than the number of indexes. You can do these after finishing your previous > > experiments. > > On top of the parallel vacuum patch, I applied Dilip's patch (0001-vacuum_costing_test.patch). I have tested by varying the number of indexes and the number of workers. I compared shared costing (0001-vacuum_costing_test.patch) vs the latest shared costing patch (shared_costing_plus_patch4_v1.patch). > With the shared costing base patch, I can see that the delay is not in sync with the I/O, which is resolved by applying shared_costing_plus_patch4_v1.patch. I have also observed that the total delay is slightly reduced with the shared_costing_plus_patch4_v1.patch patch.
> > Below is the full testing summary: > Test setup: > step1) Apply the parallel vacuum patch > step2) Apply the 0001-vacuum_costing_test.patch patch (on top of this patch, the delay is not in sync with the I/O) > step3) Apply shared_costing_plus_patch4_v1.patch (the delay is in sync with the I/O) > > Configuration settings: > autovacuum = off > max_parallel_workers = 30 > shared_buffers = 2GB > max_parallel_maintenance_workers = 20 > vacuum_cost_limit = 2000 > vacuum_cost_delay = 10 > > Test 1: Vary indexes (2, 4, 6, 8) with parallel workers fixed at 4: > > Case 1) When indexes are 2: > Without shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=120.000000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 1 delay=60.000000 total io=17931 hit=17891 miss=0 dirty=2 > > With shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=87.780000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 1 delay=87.995000 total io=17931 hit=17891 miss=0 dirty=2 > > Case 2) When indexes are 4: > Without shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=120.000000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 1 delay=80.000000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 2 delay=60.000000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 3 delay=100.000000 total io=17931 hit=17891 miss=0 dirty=2 > > With shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=87.430000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 1 delay=87.175000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 2 delay=86.340000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 3 delay=88.020000 total io=17931 hit=17891 miss=0 dirty=2 > > Case 3) When indexes are 6: > Without shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=110.000000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 1 delay=100.000000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 2 delay=160.000000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 3 delay=90.000000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 4 delay=80.000000 total io=17931 hit=17891 miss=0 dirty=2 > > With shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=173.195000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 1 delay=88.715000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 2 delay=87.710000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 3 delay=86.460000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 4 delay=89.435000 total io=17931 hit=17891 miss=0 dirty=2 > > Case 4) When indexes are 8: > Without shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=170.000000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 1 delay=120.000000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 2 delay=130.000000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 3 delay=190.000000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 4 delay=110.000000 total io=35862 hit=35782 miss=0 dirty=4 > > With shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=174.700000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 1 delay=177.880000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 2 delay=89.460000 total io=17931 hit=17891 miss=0 dirty=2 > WARNING: worker 3 delay=177.320000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 4 delay=86.810000 total io=17931 hit=17891 miss=0 dirty=2 > > Test 2: Indexes are 16 with
2, 4, and 8 parallel workers: > > Case 1) When 2 parallel workers: > Without shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=1513.230000 total io=307197 hit=85167 miss=22179 dirty=12 > WARNING: worker 1 delay=1543.385000 total io=326553 hit=63133 miss=26322 dirty=10 > WARNING: worker 2 delay=1633.625000 total io=302199 hit=65839 miss=23616 dirty=10 > > With shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=1539.475000 total io=308175 hit=65175 miss=24280 dirty=10 > WARNING: worker 1 delay=1251.200000 total io=250692 hit=71562 miss=17893 dirty=10 > WARNING: worker 2 delay=1143.690000 total io=228987 hit=93857 miss=13489 dirty=12 > > Case 2) When 4 parallel workers: > Without shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=1182.430000 total io=213567 hit=16037 miss=19745 dirty=4 > WARNING: worker 1 delay=1202.710000 total io=178941 hit=1 miss=17890 dirty=2 > WARNING: worker 2 delay=210.000000 total io=89655 hit=89455 miss=0 dirty=10 > WARNING: worker 3 delay=270.000000 total io=71724 hit=71564 miss=0 dirty=8 > WARNING: worker 4 delay=851.825000 total io=188229 hit=58619 miss=12945 dirty=8 > > With shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=1136.875000 total io=227679 hit=14469 miss=21313 dirty=4 > WARNING: worker 1 delay=973.745000 total io=196881 hit=17891 miss=17891 dirty=4 > WARNING: worker 2 delay=447.410000 total io=89655 hit=89455 miss=0 dirty=10 > WARNING: worker 3 delay=833.235000 total io=168228 hit=40958 miss=12715 dirty=6 > WARNING: worker 4 delay=683.200000 total io=136488 hit=64368 miss=7196 dirty=8 > > Case 3) When 8 parallel workers: > Without shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=1022.300000 total io=178941 hit=1 miss=17890 dirty=2 > WARNING: worker 1 delay=1072.770000 total io=178941 hit=1 miss=17890 dirty=2 > WARNING: worker 2 delay=170.000000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 3 delay=170.000000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 4 delay=140.035000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 5 delay=200.000000 total io=53802 hit=53672 miss=1 dirty=6 > WARNING: worker 6 delay=130.000000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 7 delay=150.000000 total io=53793 hit=53673 miss=0 dirty=6 > > With shared_costing_plus_patch4_v1.patch: > WARNING: worker 0 delay=872.800000 total io=178941 hit=1 miss=17890 dirty=2 > WARNING: worker 1 delay=885.950000 total io=178941 hit=1 miss=17890 dirty=2 > WARNING: worker 2 delay=175.680000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 3 delay=259.560000 total io=53793 hit=53673 miss=0 dirty=6 > WARNING: worker 4 delay=169.945000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 5 delay=613.845000 total io=125100 hit=45750 miss=7923 dirty=6 > WARNING: worker 6 delay=171.895000 total io=35862 hit=35782 miss=0 dirty=4 > WARNING: worker 7 delay=176.505000 total io=35862 hit=35782 miss=0 dirty=4 It seems that the bigger delay difference (8% - 9%), which is observed with the higher number of indexes, is due to the I/O difference; for example, in case 3 the total page misses without the patch are 35780 whereas with the patch they are 43703. So it seems that with more indexes your data is not fitting in the shared buffers, so the page hits/misses vary from run to run, and that causes variance in the total delay.
Another problem, where the delay with the patch is 2-3% less, is basically a problem of the "0001-vacuum_costing_test" patch, because that patch only displays the delay during the index vacuuming phase, not the total delay. So if we observe the total delay then it should be the same. The modified version of 0001-vacuum_costing_test is attached to print the total delay. In my test.sh, I can see the total delay is almost the same. Non-parallel vacuum WARNING: VacuumCostTotalDelay = 11332.170000 Parallel vacuum with shared_costing_plus_patch4_v1.patch: WARNING: worker 0 delay=89.230000 total io=17931 hit=17891 miss=0 dirty=2 WARNING: worker 1 delay=85.205000 total io=17931 hit=17891 miss=0 dirty=2 WARNING: worker 2 delay=87.290000 total io=17931 hit=17891 miss=0 dirty=2 WARNING: worker 3 delay=78.365000 total io=16378 hit=4318 miss=0 dirty=603 WARNING: VacuumCostTotalDelay = 11331.690000 -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
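The instrumentation change described above can be pictured as the following sketch (illustrative only; VacuumCostTotalDelay matches the WARNING output quoted above, while the helper function and its placement are assumptions): every sleep, in whichever phase it happens, is added to a single running total that is reported at the end of the operation.

/* Sketch of the instrumentation: count every sleep, in every phase. */
static double VacuumCostTotalDelay;     /* ms over the entire VACUUM */

static void
vacuum_sleep(double msec)
{
    pg_usleep((long) (msec * 1000));
    VacuumCostTotalDelay += msec;       /* not just index vacuuming */
}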
On Fri, 15 Nov 2019 at 11:54, Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Nov 13, 2019 at 10:02 AM Masahiko Sawada > <masahiko.sawada@2ndquadrant.com> wrote: > > > > I've done some tests while changing shared buffer size, delays and > > number of workers. The overall results have a similar tendency to the > > results shared by Dilip and look reasonable to me. > > > > Thanks, Sawada-san, for repeating the tests. I can see from yours, > Dilip's, and Mahendra's testing that the delay is distributed depending > upon the I/O done by a particular worker and the total I/O is also as > expected in various kinds of scenarios. So, I think this is a better > approach. Do you agree, or do you think we should still investigate > the other approach further as well? > > I would like to summarize this approach. The basic idea for parallel > vacuum is to allow the parallel workers and master backend to have a > shared view of vacuum cost related parameters (mainly > VacuumCostBalance) and allow each worker to update it and then based > on that decide whether it needs to sleep. With this basic idea, we > found that in some cases the throttling is not accurate, as explained > with an example in my email above [1] and then in the tests performed by > Dilip and others in the following emails (in short, the workers doing > more I/O can be throttled less). Then, as discussed in a later email > [2], we tried a way to avoid letting workers sleep which have done > less or no I/O compared to other workers. This ensured that > workers who are doing more I/O got throttled more. The idea is to > allow any worker to sleep only if it has performed I/O above a > certain threshold and the overall balance is more than the cost_limit > set by the system. Then we allow the worker to sleep > in proportion to the work done by it and reduce the > VacuumSharedCostBalance by the amount consumed by the current > worker. This scheme leads to the desired throttling of different > workers based on the work done by the individual worker. > > We have tested this idea with various kinds of workloads, e.g. by > varying shared buffer size, delays and number of workers. We have also > tried with different numbers of indexes and workers. In all > the tests, we found that the workers are throttled in proportion to the > I/O done by each particular worker. Thank you for summarizing! I agree with this approach. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services