Re: design for parallel backup - Mailing list pgsql-hackers

From Robert Haas
Subject Re: design for parallel backup
Date
Msg-id CA+TgmoaVyYgz06ntz_O0ne-0vVSq9tXy0d+H6NbowCs62fVKXA@mail.gmail.com
Whole thread Raw
In response to Re: design for parallel backup  (Andres Freund <andres@anarazel.de>)
Responses Re: design for parallel backup  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, Apr 21, 2020 at 2:44 AM Andres Freund <andres@anarazel.de> wrote:
> FWIW, I just tested pg_basebackup locally.
>
> Without compression and a stock postgres I get:
> unix                tcp                  tcp+ssl:
> 1.74GiB/s           1.02GiB/s            699MiB/s
>
> That turns out to be bottlenecked by the backup manifest generation.

Whoa. That's unexpected, at least for me. Is that because of the
CRC-32C overhead, or something else? What do you get with
--manifest-checksums=none?

> Without compression and a stock postgres I get, and --no-manifest
> unix                tcp                  tcp+ssl:
> 2.51GiB/s           1.63GiB/s            1.00GiB/s
>
> I.e. all of them area already above 10Gbit/s network.
>
> Looking at a profile it's clear that our small output buffer is the
> bottleneck:
> 64kB Buffers + --no-manifest:
> unix                tcp                  tcp+ssl:
> 2.99GiB/s           2.56GiB/s            1.18GiB/s
>
> At this point the backend is not actually the bottleneck anymore,
> instead it's pg_basebackup. Which is in part due to the small buffer
> used for output data (i.e. libc's FILE buffering), and in part because
> we spend too much time memmove()ing data, because of the "left-justify"
> logic in pqCheckInBufferSpace().

Hmm.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: PG compilation error with Visual Studio 2015/2017/2019
Next
From: Dilip Kumar
Date:
Subject: Re: fixing old_snapshot_threshold's time->xid mapping