Re: design for parallel backup - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: design for parallel backup
Date	April 21, 2020 14:18:20
Msg-id	CA+TgmoaVyYgz06ntz_O0ne-0vVSq9tXy0d+H6NbowCs62fVKXA@mail.gmail.com
In response to	Re: design for parallel backup (Andres Freund <andres@anarazel.de>)
Responses	Re: design for parallel backup (Andres Freund <andres@anarazel.de>)
List	pgsql-hackers

On Tue, Apr 21, 2020 at 2:44 AM Andres Freund <andres@anarazel.de> wrote:
> FWIW, I just tested pg_basebackup locally.
>
> Without compression and a stock postgres I get:
> unix                tcp                  tcp+ssl:
> 1.74GiB/s           1.02GiB/s            699MiB/s
>
> That turns out to be bottlenecked by the backup manifest generation.

Whoa. That's unexpected, at least for me. Is that because of the
CRC-32C overhead, or something else? What do you get with
--manifest-checksums=none?
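
For a rough sense of what per-byte checksum work involves, here is a
minimal bit-at-a-time CRC-32C sketch. It is a slow reference version for
illustration only, not the code PostgreSQL uses for manifest checksums;
the function name and the incremental `crc` parameter are assumptions
made for this example.

```python
CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial


def crc32c(data: bytes, crc: int = 0) -> int:
    """Bit-at-a-time CRC-32C; pass a previous result as `crc` to continue."""
    crc ^= 0xFFFFFFFF              # undo the final XOR of the previous call
    for byte in data:
        crc ^= byte
        for _ in range(8):         # eight shift/XOR steps per input byte
            crc = (crc >> 1) ^ (CRC32C_POLY if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


# Standard CRC-32C check value for the ASCII string "123456789".
print(hex(crc32c(b"123456789")))   # 0xe3069283
```

Every byte streamed goes through the inner loop, so the cost of a
checksum like this grows linearly with the size of the backup, which is
why comparing against --manifest-checksums=none should show how much of
the slowdown is attributable to it.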

> Without compression, with a stock postgres and --no-manifest, I get:
> unix                tcp                  tcp+ssl:
> 2.51GiB/s           1.63GiB/s            1.00GiB/s
>
> I.e. all of them are already above 10Gbit/s network.
>
> Looking at a profile it's clear that our small output buffer is the
> bottleneck:
> 64kB Buffers + --no-manifest:
> unix                tcp                  tcp+ssl:
> 2.99GiB/s           2.56GiB/s            1.18GiB/s
>
> At this point the backend is not actually the bottleneck anymore,
> instead it's pg_basebackup. Which is in part due to the small buffer
> used for output data (i.e. libc's FILE buffering), and in part because
> we spend too much time memmove()ing data, because of the "left-justify"
> logic in pqCheckInBufferSpace().

Hmm.
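
A rough way to picture the memmove() cost: the toy buffer below is a
sketch only, not libpq code. It assumes a fixed-size input buffer that
shifts any unconsumed bytes back to offset 0 before each read, and it
counts how many bytes get copied that way. The class and method names
are hypothetical.

```python
class LeftJustifyBuffer:
    """Toy fixed-size input buffer that left-justifies before each fill."""

    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)
        self.start = 0        # offset of the first unconsumed byte
        self.end = 0          # offset one past the last valid byte
        self.bytes_moved = 0  # total bytes copied by left-justifying

    def make_room(self) -> None:
        """Move unconsumed bytes to the front (the memmove() analogue)."""
        pending = self.end - self.start
        if self.start > 0 and pending > 0:
            self.buf[0:pending] = self.buf[self.start:self.end]
            self.bytes_moved += pending
        self.start = 0
        self.end = pending

    def fill(self, data: bytes) -> int:
        """Append as much of `data` as fits; return the number of bytes stored."""
        self.make_room()
        n = min(len(data), len(self.buf) - self.end)
        self.buf[self.end:self.end + n] = data[:n]
        self.end += n
        return n

    def consume(self, n: int) -> int:
        """Mark up to `n` bytes as processed; return how many were consumed."""
        n = min(n, self.end - self.start)
        self.start += n
        return n


b = LeftJustifyBuffer(16)
b.fill(b"abcdefghij")   # 10 bytes buffered
b.consume(4)            # 6 bytes left, sitting at offset 4
b.fill(b"klmn")         # the 6 leftover bytes are copied to offset 0 first
print(b.bytes_moved)    # 6
```

In this model, whenever the consumer leaves a partial chunk behind, the
next fill copies the leftover before reading anything new, so the copy
overhead depends on how often, and with how much data pending, that
happens.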

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
