Re: [HACKERS] Replication/backup defaults - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [HACKERS] Replication/backup defaults
Date
Msg-id f32181af-024a-6a19-5d3c-c4fb0897b0c7@2ndquadrant.com
In response to Re: [HACKERS] Replication/backup defaults  (Magnus Hagander <magnus@hagander.net>)
Responses Re: [HACKERS] Replication/backup defaults  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
On 01/05/2017 02:23 PM, Magnus Hagander wrote:
>
>
> On Thu, Jan 5, 2017 at 12:44 AM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>
>     On 01/03/2017 11:56 PM, Tomas Vondra wrote:
>
>         Hi,
>
>     ...
>
>         I'll push results for larger ones once those tests complete
>         (possibly
>         tomorrow).
>
>
>     I just pushed additional results (from the additional scales) to the
>     git repositories. On the larger (16/32-cores) machine with 2x
>     e5-2620, the results look like this
>
>        scale         minimal       replica         logical
>       -----------------------------------------------------
>        100             23968         24393           24393
>        1000            23412         23656           23794
>        10000            5283          5320            5197
>
>     and on the smaller one (i5-2500k with 4 cores) I got this:
>
>        scale         minimal       replica         logical
>       -----------------------------------------------------
>        50               5884          5896            5873
>        400              5324          5342            5478
>        1000             5341          5439            5425
>
>     The scales were chosen so that the smallest one fits into shared
>     buffers, the medium exceeds shared buffers but still fits into RAM,
>     and the largest scale exceeds RAM.
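
(For anyone reproducing this: the working-set size can be estimated from the
scale factor at roughly 15MB per scale unit — an approximation for orientation,
not an exact figure from these runs.)

```shell
# Rough pgbench database size per scale factor, assuming ~15MB per
# scale unit (100,000 pgbench_accounts rows). Approximation only.
db_size_gb() {
  awk -v s="$1" 'BEGIN { printf "%.0f\n", s * 15 / 1024 }'
}

echo "scale   100: ~$(db_size_gb 100) GB"    # small, fits shared buffers
echo "scale  1000: ~$(db_size_gb 1000) GB"   # exceeds shared buffers, fits RAM
echo "scale 10000: ~$(db_size_gb 10000) GB"  # exceeds RAM
```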
>
>     The results seem to confirm that for this workload (regular
>     pgbench), there's very little difference between the different WAL
>     levels. Actually, 'replica' seems a tad faster than 'minimal', but
>     the difference may easily be due to noise.
>
>     I've also looked at the amount of WAL actually produced, by doing
>     pgbench runs throttled to the same throughput, and counting the
>     number of archived WAL segments & running pg_xlogdump. Interestingly
>     enough, those two metrics differ quite a bit - for example for scale
>     1000 (on the 32-core machine), the 2h runs produced the following
>     numbers of WAL segments:
>
>        minimal: 5515 (88.2GB)
>        replica: 5587 (89.4GB)
>        logical: 6058 (96.9GB)
>
>     so 'replica' adds ~1.3% and 'logical' ~9.8%. But per pg_xlogdump,
>     the WAL amounts are only 73.3GB, 73.9GB and 74.4GB - a difference of
>     only ~1.5% between minimal and logical. The values are also much
>     lower than raw WAL size, so I assume it's because pg_xlogdump
>     ignores some extra overhead present in the segments. Moreover, the
>     sequential nature of WAL writes means even the +10% is not a big
>     deal (unless it results in saturating the bandwidth, but running on
>     >90% is a bad idea anyway).
>
>
> If you are using log archiving, it also means your log archive grows by
> 10% (well, 8% assuming it was 9.8% on top of 0, not on top of replica).
>

... and that the standby has to chew through the additional 10% of WAL. 
We already have standbys that occasionally struggle to keep up with the 
master, and adding more load won't make them happy (even if just 10%).
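
For what it's worth, the relative overheads can be recomputed directly from
the segment counts quoted above (one 16MB segment each):

```shell
# Recompute WAL overhead percentages from the archived segment counts
# reported above (scale 1000, 2h runs; one segment = 16MB).
minimal=5515
replica=5587
logical=6058

overhead() {
  # percentage increase of the second count over the first
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f", (b - a) * 100 / a }'
}

echo "replica adds $(overhead "$minimal" "$replica")%"   # ~1.3%
echo "logical adds $(overhead "$minimal" "$logical")%"   # ~9.8%
```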

>
>
>     My conclusion from these results is that using 'wal_level=replica'
>     by default seems fine. Perhaps even wal_level=logical would be OK,
>     but that's probably too big a step for 10.0.
>
>
>
> I think it sounds like 'replica' is the safe default.
>
> If we can make it possible to go replica<->logical without a restart,
> that makes it easy enough to increase it if necessary, and the default
> still applies to most people (most people take backups, most people
> probably don't do logical replication).
>

My thoughts, exactly.
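
For context, wal_level is set in postgresql.conf and currently still requires
a restart to change. A sketch of what the proposed defaults might look like
(the max_wal_senders value is illustrative, not part of any proposal here):

```
# postgresql.conf (illustrative fragment)
wal_level = replica          # minimal | replica | logical
max_wal_senders = 10         # > 0 so pg_basebackup / streaming work out of the box
```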

>
>
>     Any ideas how to construct a plausible workload where the
>     differences are significantly larger? Running the tests on non-SSD
>     storage might also be useful.
>
>
> It's easy enough to construct a benchmark specifically to show the
> difference, but I don't know of any actual "normal workload" for it.
> Typically the optimization applies to things like bulk loading, which is
> typically never done alone and does not lend itself to that type of
> benchmarking very easily.
>

Not sure I understand you correctly. You're saying that although it'd be 
easy to construct a benchmark showing a significant performance impact, it 
won't represent a common workload. Correct?
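
(To make the bulk-loading case concrete, a sketch of the pattern where
wal_level=minimal can skip full WAL-logging — table and file names here are
made up:)

```sql
-- Under wal_level=minimal, a table created (or truncated) in the same
-- transaction that loads it can skip WAL-logging the data and just
-- fsync the files at commit; under replica/logical every row is logged.
BEGIN;
CREATE TABLE bulk_target (id int, payload text);
COPY bulk_target FROM '/tmp/data.csv' WITH (FORMAT csv);
COMMIT;
```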

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


