Re: Backup throttling - Mailing list pgsql-hackers

From Boszormenyi Zoltan
Subject Re: Backup throttling
Date
Msg-id 529C89BF.3000702@cybertec.at
Whole thread Raw
In response to Re: Backup throttling  (Antonin Houska <antonin.houska@gmail.com>)
Responses Re: Backup throttling
List pgsql-hackers
Hi,

I am reviewing your patch.

2013-10-10 15:32 keltezéssel, Antonin Houska írta:
> On 10/09/2013 08:56 PM, Robert Haas wrote:
>> There seem to be several review comments made since you posted this
>> version.  I'll mark this Waiting on Author in the CommitFest
>> application, since it seems that the patch needs to be further
>> updated.
> Since it was waiting for reviewer, I was not sure whether I should wait
> for his findings and fix everything in a single transaction.
> Nevertheless, the next version is attached here.
>
> It fixes findings reported in
> http://www.postgresql.org/message-id/20130903165652.GC5227@eldon.alvh.no-ip.org
>
> As for units
> http://www.postgresql.org/message-id/20130903224127.GD7018@awork2.anarazel.de
> I'm not convinced about "MB" at the moment. Unfortunately I couldn't
> find any other command-line PG utility requiring amount of data as an
> option. Thus I use single-letter suffix, just as wget does for the same
> purposes.
>
> As for this
> http://www.postgresql.org/message-id/20130903125155.GA18486@awork2.anarazel.de
> there eventually seems to be a consensus (I notice now the discussion
> was off-list):
>
>> On 2013-09-03 23:21:57 +0200, Antonin Houska wrote:
>>> On 09/03/2013 02:51 PM, Andres Freund wrote:
>>>
>>>> It's probably better to use latches for the waiting, those have properly
>>>> defined interruption semantics. Whether pg_usleep will be interrupted is
>>>> platform dependant...
>>> I noticed a comment about interruptions around the definition of
>>> pg_usleep() function, but concluded that the sleeps are rather frequent
>>> in this applications (typically in the order of tens to hundreds per
>>> second, although the maximum of 256 might need to be decreased).
>>> Therefore occasional interruptions shouldn't distort the actual rate
>>> much. I'll think about it again. Thanks,
>> The issue is rather that you might not get woken up when you want to
>> be. Signal handlers in postgres tend to do a
>> SetLatch(&MyProc->procLatch); which then will interrupt sleeps done via
>> WaitLatch(). It's probably not that important with the duration of the
>> sleeps you have.
>>
>> Greetings,
>>
>> Andres Freund
> // Antonin Houska (Tony)

* Is the patch in a patch format which has context?

Yes

* Does it apply cleanly to the current git master?

It applies with some offsets. Version "3a" that applies cleanly is attached.

* Does it include reasonable tests, necessary doc patches, etc?

Docs: yes. Tests: N/A

* Does the patch actually implement what it advertises?

Yes.

* Do we want that?

Yes.

* Do we already have it?

No.

* Does it follow SQL spec, or the community-agreed behavior?

No such SQL spec. The latest patch fixed all previously raised comments.

* Does it include pg_dump support (if applicable)?

N/A

* Are there dangers?

Yes, the "danger" of slowing down taking a base backup.
But this is the point.

* Have all the bases been covered?

Yes.

* Does the feature work as advertised?

Yes.

* Are there corner cases the author has failed to consider?

No.

* Are there any assertion failures or crashes?

No.

* Does the patch slow down simple tests?

No.

* If it claims to improve performance, does it?

N/A

* Does it slow down other things?

No.

* Does it follow the project coding guidelines?

Yes. A nitpicking: this else branch below might need brackets
because there is also a comment in that branch:

+                       /* The 'real data' starts now (header was ignored). */
+                       throttled_last = GetCurrentIntegerTimestamp();
+               }
+               else
+                       /* Disable throttling. */
+                       throttling_counter = -1;
+

* Are there portability issues?

No.

* Will it work on Windows/BSD etc?

It should, there are no dangerous calls besides the above mentioned
pg_usleep(). But waking up from pg_usleep() earlier makes rate limiting
fluctuate slightly, not fail.

* Are the comments sufficient and accurate?

Yes.

* Does it do what it says, correctly?

Yes.

Although it should be mentioned in the docs that rate limiting
applies to walsenders individually, not globally. I tried this
on a freshly created database:

$ time pg_basebackup -D data2 -r 1M -X stream -h 127.0.0.1
real    0m26.508s
user    0m0.054s
sys    0m0.360s

The source database had 3 WAL files in pg_xlog, one of them was
also streamed. The final size of "data2" was 43MB or 26MB without pg_xlog,
i.e. without the "-X stream" option. The backup used 2 walsenders
in parallel (one for WAL) which is a known feature.

* Does it produce compiler warnings?

No.

*Can you make it crash?

No.

Consider the changes to the code in the context of the project as a whole:

* Is everything done in a way that fits together coherently with other features/modules?

Yes.

* Are there interdependencies that can cause problems?

No.

Another note. This chunk should be submitted separately as a comment bugfix:

diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index c3c71b7..5736fd8 100644
--- a/src/backend/utils/adt/timestamp.c
+++ b/src/backend/utils/adt/timestamp.c
@@ -1288,7 +1288,7 @@ GetCurrentTimestamp(void)
  /*
   * GetCurrentIntegerTimestamp -- get the current operating system time as int64
   *
- * Result is the number of milliseconds since the Postgres epoch. If compiled
+ * Result is the number of microseconds since the Postgres epoch. If compiled
   * with --enable-integer-datetimes, this is identical to GetCurrentTimestamp(),
   * and is implemented as a macro.
   */

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
      http://www.postgresql.at/


Attachment

pgsql-hackers by date:

Previous
From: "MauMau"
Date:
Subject: Re: DATE type output does not follow datestyle parameter
Next
From: Dimitri Fontaine
Date:
Subject: Re: Extension Templates S03E11