Re: Per table autovacuum vacuum cost limit behaviour strange - Mailing list pgsql-hackers
From | Amit Kapila
---|---
Subject | Re: Per table autovacuum vacuum cost limit behaviour strange
Date | 
Msg-id | CAA4eK1KeSDGy1taebUG5OO7SYW34eSSbmqVaQ5-XwhsQfP=Cxg@mail.gmail.com
In response to | Re: Per table autovacuum vacuum cost limit behaviour strange (Alvaro Herrera <alvherre@2ndquadrant.com>)
List | pgsql-hackers
On Tue, Aug 26, 2014 at 9:49 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> So my proposal is a bit more complicated. First we introduce the notion
> of a single number, to enable sorting and computations: the "delay
> equivalent", which is the cost_limit divided by cost_delay. The highest
> the value is for any table, the fastest it is vacuumed. (It makes sense
> in physical terms: a higher cost_limit makes it faster, because vacuum
> sleeps less often; and a higher cost_delay makes it go slower, because
> vacuum sleeps for longer.) Now, the critical issue is to notice that
> not all tables are equal; they can be split into two groups, those that go
> faster than the global delay equivalent
> (i.e. the effective values of GUC variables
> autovacuum_vacuum_cost_limit/autovacuum_vacuum_cost_delay), and those
> that go at the same speed or slower. For the latter group, the rebalancing
> algorithm "distributes" the I/O allocated by the global vars, in a
> pro-rated manner. For the former group (tables vacuumed faster than
> global delay equiv), to rebalance we don't consider the global delay
> equiv but the delay equiv of the fastest table currently being vacuumed.
>
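A minimal sketch of the "delay equivalent" computation and the grouping described above (not the patch's code; the numbers are just the effective global defaults plus one made-up per-table setting):

```c
#include <stdio.h>

/* "delay equivalent" as described above: cost_limit / cost_delay */
static double
delay_equiv(int cost_limit, int cost_delay)
{
    return (double) cost_limit / cost_delay;
}

int
main(void)
{
    /* effective global defaults: cost_limit=200, cost_delay=20ms */
    double  global_de = delay_equiv(200, 20);   /* 10 */
    /* hypothetical per-table setting, like the volatile table below */
    double  table_de = delay_equiv(10000, 20);  /* 500 */

    printf("global=%.0f, table=%.0f -> table goes in the %s group\n",
           global_de, table_de,
           table_de > global_de ? "fast" : "slow/equal");
    return 0;
}
```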
> Suppose we have two tables, delay_equiv=10 each (which is the default
> value). If they are both vacuumed in parallel, then we distribute a
> delay_equiv of 5 to each (so set cost_limit=100, cost_delay=20). As
> soon as one of them finishes, the remaining one is allowed to upgrade to
> delay_equiv=10 (cost_limit=200, cost_delay=20).
>
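The two-table arithmetic above, as a small illustrative sketch (my reading of the proposal, not the patch's code):

```c
#include <stdio.h>

int
main(void)
{
    double  de[2] = {10.0, 10.0};   /* two tables at the default delay_equiv */
    double  global_de = 10.0;       /* 200 / 20 */
    double  sum = de[0] + de[1];
    int     i;

    /* both vacuumed in parallel: split the global delay_equiv pro rata */
    for (i = 0; i < 2; i++)
    {
        double  share = global_de * de[i] / sum;    /* 5 each */

        printf("table %d: delay_equiv=%.0f (cost_limit=%.0f, cost_delay=20)\n",
               i, share, share * 20);
    }

    /* once one worker finishes, the survivor goes back to delay_equiv=10,
     * i.e. cost_limit=200, cost_delay=20 */
    return 0;
}
```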
> Now add a third table, delay_equiv=500 (cost_limit=10000, cost_delay=20;
> this is Mark's volatile table). If it's being vacuumed on its own, just
> assign cost_limit=10000, cost_delay=20, as normal. If one of the other
> two tables is being vacuumed, that one will use delay_equiv=10, as per
> above. To balance the volatile table, we take the delay_equiv of this
> one and subtract the already handed-out delay_equiv of 10; so we set the
> volatile table to delay_equiv=490 (cost_limit=9800, cost_delay=20).
>
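The volatile-table case, again as a sketch of how I read the quoted arithmetic:

```c
#include <stdio.h>

int
main(void)
{
    double  fastest_de = 10000.0 / 20;  /* the volatile table: 500 */
    double  slow_share = 10.0;          /* already handed out to the default table */

    /* the volatile table gets the fastest delay_equiv minus what the
     * slow group already received */
    double  de = fastest_de - slow_share;   /* 490 */

    printf("volatile table: delay_equiv=%.0f (cost_limit=%.0f, cost_delay=20)\n",
           de, de * 20);
    return 0;
}
```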
> If we do it this way, the whole system is running at the full speed
> enabled by the fastest table for which we have set per-table options, but
> we have also scaled things so that the slow tables go slow and the fast
> tables go fast.
>
> As a more elaborate example, add a fourth table with delay_equiv=50
> (cost_limit=1000, cost_delay=20). This is also faster than the global
> vars, so we put it in the first group. If all four tables are being
> vacuumed in parallel, we have the two slow tables going at delay_equiv=5
> each (cost_limit=100, cost_delay=20); then there is delay_equiv=490 left to
> distribute among the remaining ones; pro-rating this we have
> delay_equiv=445 (cost_limit=8900, cost_delay=20) for the volatile table
> and delay_equiv=45 (cost_limit=900, cost_delay=20) for the other one.
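And the four-table pro-rating, assuming my reading of the description is right (illustrative sketch only):

```c
#include <stdio.h>

int
main(void)
{
    double  fast_de[2] = {500.0, 50.0}; /* the volatile table and the fourth one */
    double  budget = 500.0 - 10.0;      /* fastest delay_equiv minus the slow group's 10 */
    double  sum = fast_de[0] + fast_de[1];
    int     i;

    for (i = 0; i < 2; i++)
    {
        /* pro-rate the budget; round as in the quoted example */
        int     de = (int) (budget * fast_de[i] / sum + 0.5);   /* 445 and 45 */

        printf("fast table %d: delay_equiv=%d (cost_limit=%d, cost_delay=20)\n",
               i, de, de * 20);
    }
    return 0;
}
```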
How will this calculation behave if the third table has delay_equiv = 30
and the fourth table has delay_equiv = 20? Both are greater than the
default delay_equiv = 10, so they will participate in the fast group. As
per my understanding of the above calculation, both might get the same
delay_equiv, but I might be wrong, because your patch still has a FixMe
and I haven't yet fully understood its code.
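To make the question concrete, here is what I get if I mechanically apply the same pro-rating as in your four-table example to these numbers (the patch may well do something different, which is exactly what I am trying to understand):

```c
#include <stdio.h>

int
main(void)
{
    double  fast_de[2] = {30.0, 20.0};  /* third and fourth tables */
    double  budget = 30.0 - 10.0;       /* fastest delay_equiv minus the slow group's 10 */
    double  sum = fast_de[0] + fast_de[1];
    int     i;

    for (i = 0; i < 2; i++)
    {
        double  de = budget * fast_de[i] / sum;     /* 12 and 8 */

        printf("table %d: delay_equiv=%.0f (cost_limit=%.0f, cost_delay=20)\n",
               i, de, de * 20);
    }
    return 0;
}
```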
In general, I have a feeling that distributing the vacuuming speed is
a good way to tune the system; however, if a user wants to override
that by providing specific values for particular tables, we should
honour those settings.