Thread: [HACKERS] pg_basebackup throttling doesn't throttle as promised

[HACKERS] pg_basebackup throttling doesn't throttle as promised

From
Jeff Janes
Date:
The "-r" option to pg_basebackup is supposed to throttle the rate of the backup.  But it only works properly if the server is mostly idle.

Every non-trivial call to XLogFlush or XLogBackgroundFlush will wake up the wal sender (the one which is not really sending wal, but base files), and the throttling routine doesn't go back to sleep after being awoke early.  Rather, it releases another 32kb of data.


Should the basebackup.c throttle sleep in a loop until its time has expired?  

Or should walsender.c WalSndWakeup not wake a wal sender whose status is WALSNDSTATE_BACKUP?

Or both?

Cheers,

Jeff

Re: [HACKERS] pg_basebackup throttling doesn't throttle as promised

From
Jeff Janes
Date:
On Fri, Sep 1, 2017 at 1:32 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
The "-r" option to pg_basebackup is supposed to throttle the rate of the backup.  But it only works properly if the server is mostly idle.

Every non-trivial call to XLogFlush or XLogBackgroundFlush will wake up the wal sender (the one which is not really sending wal, but base files), and the throttling routine doesn't go back to sleep after being awoke early.  Rather, it releases another 32kb of data.


Should the basebackup.c throttle sleep in a loop until its time has expired?  

Or should walsender.c WalSndWakeup not wake a wal sender whose status is WALSNDSTATE_BACKUP?

Or both?


I'm attaching a patch for each option.  Each one independently solves the problem.  But I think we should do both.  There is no point in issuing unnecessary kill system calls, and there may also be more spurious wake-ups than just these ones.

Cheers,

Jeff
Attachment

Re: [HACKERS] pg_basebackup throttling doesn't throttle as promised

From
Antonin Houska
Date:
Jeff Janes <jeff.janes@gmail.com> wrote:

> On Fri, Sep 1, 2017 at 1:32 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>
>  The "-r" option to pg_basebackup is supposed to throttle the rate of the
>  backup. But it only works properly if the server is mostly idle.
>
>  Every non-trivial call to XLogFlush or XLogBackgroundFlush will wake up the
>  wal sender (the one which is not really sending wal, but base files), and
>  the throttling routine doesn't go back to sleep after being awoke
>  early. Rather, it releases another 32kb of data.

Sorry, I missed this fact when coding the feature.

>  Should the basebackup.c throttle sleep in a loop until its time has
>  expired?

I think this is the correct approach because latch can be set for unrelated
reasons.

The patch makes sense to me. I just recommend moving this part in front of the
loop because elapsed_min does not have to be re-computed each time:
   /* How much should have elapsed at minimum? */   elapsed_min = elapsed_min_unit * (throttling_counter /
throttling_sample);

And also a comment explaining the purpose of the loop would be
appreciated. Thanks.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



Re: [HACKERS] pg_basebackup throttling doesn't throttle as promised

From
Michael Paquier
Date:
On Sat, Sep 2, 2017 at 6:42 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
> I'm attaching a patch for each option.  Each one independently solves the
> problem.  But I think we should do both.  There is no point in issuing
> unnecessary kill system calls, and there may also be more spurious wake-ups
> than just these ones.

I agree that 1) and 2) are sensible things to do now. At some point I
think that we will want do_pg_stop_backup() to use a wait event
instead of a pg_usleep call when waiting for a segment to be archived,
in which case we could use WalSndWakeup() to set the latch and
accelerate things. So it would be nice to keep track of this
possibility in the code. We could as well do that with a new API only
signalling WAL senders in backup state though.
               SpinLockAcquire(&walsnd->mutex);               latch = walsnd->latch;
+               state = walsnd->state;               SpinLockRelease(&walsnd->mutex);

-               if (latch != NULL)
+               if (latch != NULL && state != WALSNDSTATE_BACKUP)                       SetLatch(latch);
This surely meritates a comment.
-- 
Michael



Re: [HACKERS] pg_basebackup throttling doesn't throttle as promised

From
Alvaro Herrera
Date:
Jeff Janes wrote:

> I'm attaching a patch for each option.  Each one independently solves the
> problem.  But I think we should do both.  There is no point in issuing
> unnecessary kill system calls, and there may also be more spurious wake-ups
> than just these ones.

I modified patch 1 a bit -- result attached.  I would like to back-patch
this all the way back to 9.4, but there are a few conflicts due to
c29aff959dc so I think I'm going to do pg10 only unless I get some votes
that it should go further back.  I think it should be fairly trivial to
resolve.

The other patch should be applied to master only IMO.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] pg_basebackup throttling doesn't throttle as promised

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:
> Jeff Janes wrote:
> 
> > I'm attaching a patch for each option.  Each one independently solves the
> > problem.  But I think we should do both.  There is no point in issuing
> > unnecessary kill system calls, and there may also be more spurious wake-ups
> > than just these ones.
> 
> I modified patch 1 a bit -- result attached.  I would like to back-patch
> this all the way back to 9.4, but there are a few conflicts due to
> c29aff959dc so I think I'm going to do pg10 only unless I get some votes
> that it should go further back.  I think it should be fairly trivial to
> resolve.

Pushed patch 1 to master and pg10.

Please add the other one to commitfest.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services