Thread: [PERFORM] Questionnaire: Common WAL write rates on busy servers.
Hi,
I've lately seen more and more installations where the generation of write-ahead log (WAL) is one of the primary bottlenecks. I'm curious whether that's primarily a "sampling error" of mine, or whether that's indeed more common.
The primary reason I'm curious is that I'm pondering a few potential optimizations, and would like to have some guidance which are more and which are less important.
Questions (answer as many as you can comfortably answer):
- How many MB/s, segments/s do you see on busier servers?
- What generates the bulk of WAL on your servers (9.5+ can use pg_xlogdump --stats to compute that)?
- Are you seeing WAL writes being a bottleneck?
- What kind of backup methods are you using and is the WAL volume a problem?
- What kind of replication are you using and is the WAL volume a problem?
- What are your settings for wal_compression, max_wal_size (9.5+) / checkpoint_segments (< 9.5), checkpoint_timeout and wal_buffers?
- Could you quickly describe your workload?
Feel free to add any information you think is pertinent ;)
Greetings, Andres Freund
Hi Andres.
On 25 Apr 2017, at 7:17, Andres Freund <andres@anarazel.de> wrote:
Hi,
I've lately seen more and more installations where the generation of
write-ahead-log (WAL) is one of the primary bottlenecks. I'm curious
whether that's primarily a "sampling error" of mine, or whether that's
indeed more common.
The primary reason I'm curious is that I'm pondering a few potential
optimizations, and would like to have some guidance which are more and
which are less important.
Questions (answer as many you can comfortably answer):
- How many MB/s, segments/s do you see on busier servers?
Nearly one WAL segment (16 MB) per second most of the time, and 3 WALs per second at the beginning of a checkpoint (due to full_page_writes).
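For reference, a back-of-envelope conversion from the segment rates above to byte rates (a rough sketch; assumes the default 16 MB WAL segment size):

```python
# Rough conversion from WAL segment switches to byte rates.
# Assumes the default 16 MB WAL segment size.
SEGMENT_MB = 16

def wal_rate_mb_per_s(segments_per_s: float) -> float:
    """MB/s of WAL implied by a given segment-switch rate."""
    return segments_per_s * SEGMENT_MB

steady = wal_rate_mb_per_s(1)      # ~16 MB/s in steady state
burst = wal_rate_mb_per_s(3)       # ~48 MB/s right after a checkpoint starts
daily_gb = steady * 86400 / 1024   # ~1350 GB (~1.3 TB) of WAL per day
```

Even the steady-state rate adds up to well over a terabyte of WAL per day, which is why archiving and replication bandwidth come up repeatedly in this thread.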
- What generates the bulk of WAL on your servers (9.5+ can use
pg_xlogdump --stats to compute that)?
Here is the output from a couple of our masters (and that is actually two hours before peak load):
$ pg_xlogdump --stats 0000000100012B2800000089 0000000100012B3000000088 | fgrep -v 0.00
Type N (%) Record size (%) FPI size (%) Combined size (%)
---- - --- ----------- --- -------- --- ------------- ---
Heap2 55820638 ( 21.31) 1730485085 ( 22.27) 1385795249 ( 13.28) 3116280334 ( 17.12)
Heap 74366993 ( 28.39) 2288644932 ( 29.46) 5880717650 ( 56.34) 8169362582 ( 44.87)
Btree 84655827 ( 32.32) 2243526276 ( 28.88) 3170518879 ( 30.38) 5414045155 ( 29.74)
-------- -------- -------- --------
Total 261933790 7769663301 [42.67%] 10437031778 [57.33%] 18206695079 [100%]
$
$ pg_xlogdump --stats 000000010000D17F000000A5 000000010000D19100000004 | fgrep -v 0.00
Type N (%) Record size (%) FPI size (%) Combined size (%)
---- - --- ----------- --- -------- --- ------------- ---
Heap2 13676881 ( 18.95) 422289539 ( 19.97) 15319927851 ( 25.63) 15742217390 ( 25.44)
Heap 22284283 ( 30.88) 715293050 ( 33.83) 17119265188 ( 28.64) 17834558238 ( 28.82)
Btree 27640155 ( 38.30) 725674896 ( 34.32) 19244109632 ( 32.19) 19969784528 ( 32.27)
Gin 6580760 ( 9.12) 172246586 ( 8.15) 8091332009 ( 13.54) 8263578595 ( 13.35)
-------- -------- -------- --------
Total 72172983 2114133847 [3.42%] 59774634680 [96.58%] 61888768527 [100%]
$
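The bracketed percentages on the Total lines above are just the record/FPI split of the combined volume; a small sketch of how they are derived (totals copied from the two dumps above):

```python
# How the bracketed percentages on pg_xlogdump's Total line are derived:
# FPI share = FPI bytes / (record bytes + FPI bytes).
# Totals copied from the two dumps above.
def fpi_share_pct(record_bytes: int, fpi_bytes: int) -> float:
    return 100.0 * fpi_bytes / (record_bytes + fpi_bytes)

first = fpi_share_pct(7769663301, 10437031778)   # ~57.33% FPI
second = fpi_share_pct(2114133847, 59774634680)  # ~96.58% FPI: nearly all of
# the second server's WAL volume is full-page images, not logical records.
```

The contrast between the two servers suggests very different update patterns relative to checkpoint frequency.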
- Are you seeing WAL writes being a bottleneck?
We do sometimes see WALWriteLock in pg_stat_activity.wait_event, but not too often.
- What kind of backup methods are you using and is the WAL volume a
problem?
We use a fork of the barman project. In most cases that's not a problem.
- What kind of replication are you using and is the WAL volume a
problem?
Physical streaming replication. We used to have problems with network bandwidth (1 Gbit/s was consumed by transferring WAL to two replicas and one archive), but that got better after 1. upgrading to 9.5 and turning wal_compression on, 2. changing the archive command to do parallel compression and send WALs to the archive, 3. increasing checkpoint_timeout.
- What are your settings for wal_compression, max_wal_size (9.5+) /
checkpoint_segments (< 9.5), checkpoint_timeout and wal_buffers?
xdb301e/postgres M # SELECT name, current_setting(name) FROM pg_settings
WHERE name IN ('max_wal_size', 'checkpoint_timeout', 'wal_compression', 'wal_buffers');
name | current_setting
--------------------+-----------------
checkpoint_timeout | 1h
max_wal_size | 128GB
wal_buffers | 16MB
wal_compression | on
(4 rows)
Time: 0.938 ms
xdb301e/postgres M #
- Could you quickly describe your workload?
OLTP workload with 80% reads and 20% writes.
Feel free to add any information you think is pertinent ;)
Well, we actually worked around the WAL write rate issues by increasing checkpoint_timeout to the maximum possible (in 9.6 it can be even higher). The downside of this change is recovery time. Thanks to Postgres for its stability, but sometimes you can waste ~10 minutes just restarting Postgres to upgrade to a new minor version, and that's not really cool.
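The recovery-time downside can be estimated roughly: after a crash, redo has to replay up to about one checkpoint interval worth of WAL. A hypothetical sketch (the 100 MB/s single-process replay rate is an assumed figure, not a measurement; real rates depend heavily on hardware and record mix):

```python
# Worst-case crash-recovery estimate: redo replays up to ~one checkpoint
# interval of WAL. The replay rate is an assumed placeholder, not from the
# thread; real replay speed varies with hardware and WAL record mix.
def recovery_minutes(wal_mb_per_s: float, checkpoint_timeout_min: float,
                     replay_mb_per_s: float) -> float:
    wal_to_replay_mb = wal_mb_per_s * checkpoint_timeout_min * 60
    return wal_to_replay_mb / replay_mb_per_s / 60

# ~16 MB/s of WAL with 1h checkpoints and a hypothetical 100 MB/s replay:
est = recovery_minutes(16, 60, 100)  # ~9.6 minutes, in line with the ~10
                                     # minute restarts described above
```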
Greetings,
Andres Freund
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
On Mon, Apr 24, 2017 at 9:17 PM, Andres Freund <andres@anarazel.de> wrote:
Questions (answer as many you can comfortably answer):
- How many MB/s, segments/s do you see on busier servers?
- What generates the bulk of WAL on your servers (9.5+ can use
pg_xlogdump --stats to compute that)?
- Are you seeing WAL writes being a bottleneck?
- What kind of backup methods are you using and is the WAL volume a
problem?
- What kind of replication are you using and is the WAL volume a
problem?
- What are your settings for wal_compression, max_wal_size (9.5+) /
checkpoint_segments (< 9.5), checkpoint_timeout and wal_buffers?
- Could you quickly describe your workload?
* Postgresql 9.3
* 1500+ db servers
* Daily peak for busy databases: 75 WALs switched per second (less than 10% of the servers experience this)
* Avg per db: 2 WALs/s
* Mainly generated by large batch sync processes that occur throughout the day, and by a legacy archiving process to purge older data (potentially many millions of cascading deletes).
* Half the servers have (encrypted) pg_dump backups; WAL volume hasn't proved to be a problem there, though dump size is a problem for a few of the larger databases (less than 1TB).
* Inter-data-centre replication is all streaming; across DCs (over the WAN), WAL shipping is over compressed SSH tunnels.
Occasionally the streaming replication falls behind, but more commonly it is the cross-DC log shipping that becomes a problem. Some of the servers will generate 50+ GBs of WAL in a matter of minutes and that backs up immediately on the masters. Occasionally this has a knock-on effect for other servers and slows down their log shipping due to network saturation.
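A quick sanity check on why those bursts saturate the network (the 5-minute window is an assumed illustration of "a matter of minutes"; the 1 Gbit/s link speed is the figure mentioned earlier in the thread):

```python
# 50 GB of WAL generated in an assumed 5-minute burst vs. a 1 Gbit/s link.
wal_gb, window_s = 50, 5 * 60
needed_mbit_per_s = wal_gb * 1024 * 8 / window_s  # ~1365 Mbit/s required
link_mbit_per_s = 1000
ratio = needed_mbit_per_s / link_mbit_per_s       # ~1.37: the burst alone
# needs more than the entire link, before compression or any other traffic.
```

So even with compression, a single bursting server can crowd out log shipping for every other server sharing the link, which matches the knock-on effect described above.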
* checkpoint_segments: 64, checkpoint_timeout: 5 mins, wal_buffers: 16MB
Workload:
70% of servers are generally quiet, with occasional bursty reads and writes.
20% are medium use, avg a few hundred transactions/second
10% average around 5k txns/s, with bursts up to 25k txns/s for several minutes.
All servers have about 80% reads / 20% writes, though those numbers flip during big sync jobs and when the purging maintenance kicks off.
On Tue, Apr 25, 2017 at 1:17 AM, Andres Freund <andres@anarazel.de> wrote:
> Questions (answer as many you can comfortably answer):
> - How many MB/s, segments/s do you see on busier servers?
~20MB/s with FPW compression, with peaks of ~35MB/s. Writes become the bottleneck without compression; it tops out at about 40-50MB/s, and WAL archiving cannot keep up beyond that point.
> - What generates the bulk of WAL on your servers (9.5+ can use
> pg_xlogdump --stats to compute that)?
Type N (%) Record size (%) FPI size (%) Combined size (%)
---- - --- ----------- --- -------- --- ------------- ---
XLOG 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Transaction 30 ( 0.00) 960 ( 0.00) 0 ( 0.00) 960 ( 0.00)
Storage 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
CLOG 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Database 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Tablespace 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
MultiXact 110 ( 0.01) 7456 ( 0.02) 0 ( 0.00) 7456 ( 0.00)
RelMap 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Standby 2 ( 0.00) 368 ( 0.00) 0 ( 0.00) 368 ( 0.00)
Heap2 2521 ( 0.22) 78752 ( 0.24) 4656133 ( 2.82) 4734885 ( 2.39)
Heap 539419 ( 46.52) 15646903 ( 47.14) 98720258 ( 59.87) 114367161 ( 57.73)
Btree 606573 ( 52.31) 15872182 ( 47.82) 57514798 ( 34.88) 73386980 ( 37.05)
Hash 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Gin 2866 ( 0.25) 134330 ( 0.40) 4012251 ( 2.43) 4146581 ( 2.09)
Gist 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
Sequence 7970 ( 0.69) 1450540 ( 4.37) 0 ( 0.00) 1450540 ( 0.73)
SPGist 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
BRIN 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
CommitTs 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
ReplicationOrigin 0 ( 0.00) 0 ( 0.00) 0 ( 0.00) 0 ( 0.00)
-------- -------- -------- --------
Total 1159491 33191491 [16.76%] 164903440 [83.24%] 198094931 [100%]
> - Are you seeing WAL writes being a bottleneck?
Sometimes, more so without FPW compression.
> - What kind of backup methods are you using and is the WAL volume a
> problem?
> - What kind of replication are you using and is the WAL volume a
> problem?
Streaming to hot standby + WAL archiving, delayed standby as backup and PITR. Backups are regular filesystem-level snapshots of the delayed standby (with postgres down to get consistent snapshots). WAL volume getting full during periods where the hot standby lags behind (or when we have to stop it to create consistent snapshots) is an issue indeed, and we've had to provision significant storage to be able to absorb those peaks (1TB of WAL).
We bundle WAL segments into groups of 256 segments for archiving and recovery to minimize the impact of TCP slow start. We further gzip segments before transfer with pigz, and we mostly use rsync (with a wrapper script that takes care of durability and error handling) to move segments around. Getting the archive/recovery scripts to handle the load hasn't been trivial.
> - What are your settings for wal_compression, max_wal_size (9.5+) /
> checkpoint_segments (< 9.5), checkpoint_timeout and wal_buffers?
wal_compression = on
max_wal_size = 12GB
min_wal_size = 2GB
checkpoint_timeout = 30min
wal_buffers = -1 (16MB effective)
> - Could you quickly describe your workload?
Steady stream of (preaggregated) input events plus upserts into ~12 partitioned aggregate "matviews" (within quotes since they're manually maintained up to date). Input rate is approximately 9000 rows/s, not counting the upserts onto the aggregate matviews. Old information is regularly compressed and archived into less detailed partitions for a steady size of about 5TB.
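The 256-segment bundling described above works out to sizeable transfer units (a sketch assuming the default 16 MB segment size; the 3:1 compression ratio is an assumed figure, not from the thread):

```python
# Archiving WAL in bundles of 256 segments (16 MB each, the default size).
SEGMENT_MB = 16
bundle_mb = 256 * SEGMENT_MB       # 4096 MB (4 GB) per uncompressed bundle
# With a hypothetical 3:1 pigz compression ratio, each transfer is ~1.3 GB,
# large enough to amortize TCP slow start across the whole bundle.
compressed_gb = bundle_mb / 3 / 1024
```

Transferring one multi-gigabyte stream instead of 256 small files means the connection spends almost all of its time at full window size, which is the stated motivation for bundling.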
Hi,
On 2017-04-24 21:17:43 -0700, Andres Freund wrote:
> I've lately seen more and more installations where the generation of
> write-ahead-log (WAL) is one of the primary bottlenecks. I'm curious
> whether that's primarily a "sampling error" of mine, or whether that's
> indeed more common.
>
> The primary reason I'm curious is that I'm pondering a few potential
> optimizations, and would like to have some guidance which are more and
> which are less important.
>
> Questions (answer as many you can comfortably answer):
> - How many MB/s, segments/s do you see on busier servers?
> - What generates the bulk of WAL on your servers (9.5+ can use
> pg_xlogdump --stats to compute that)?
> - Are you seeing WAL writes being a bottleneck?
> - What kind of backup methods are you using and is the WAL volume a
> problem?
> - What kind of replication are you using and is the WAL volume a
> problem?
> - What are your settings for wal_compression, max_wal_size (9.5+) /
> checkpoint_segments (< 9.5), checkpoint_timeout and wal_buffers?
> - Could you quickly describe your workload?
Ok, based on the, few, answers I've got so far, my experience is indeed skewed. A number of the PG users I interacted with over the last couple of years had WAL write rates somewhere in the range of 500MB/s to 2.2GB/s (the max I've seen). At that point WAL insertion became a major bottleneck, even if storage was more than fast enough to keep up. To address these we'd need some changes, but the feedback so far suggests that it's not yet a widespread issue...
- Andres
On 04/27/2017 08:59 AM, Andres Freund wrote:
> Ok, based on the, few, answers I've got so far, my experience is indeed
> skewed. A number of the PG users I interacted with over the last couple
> of years had WAL write rates somewhere in the range of 500MB/s to 2.2GB/s
> (the max I've seen). At that point WAL insertion became a major bottleneck,
> even if storage was more than fast enough to keep up. To address these
> we'd need some changes, but the feedback so far suggests that it's not
> yet a widespread issue...
I would agree it isn't yet a widespread issue. The only people that are likely going to see this are going to be on bare metal. We should definitely plan on that issue for, say, 11.
I do have a question though: where you have seen this issue, is it with synchronous_commit on or off?
Thanks,
JD
--
Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.
On 2017-04-27 09:31:34 -0700, Joshua D. Drake wrote:
> On 04/27/2017 08:59 AM, Andres Freund wrote:
> > Ok, based on the, few, answers I've got so far, my experience is indeed
> > skewed. A number of the PG users I interacted with over the last couple
> > of years had WAL write rates somewhere in the range of 500MB/s to 2.2GB/s
> > (the max I've seen). At that point WAL insertion became a major bottleneck,
> > even if storage was more than fast enough to keep up. To address these
> > we'd need some changes, but the feedback so far suggests that it's not
> > yet a widespread issue...
>
> I would agree it isn't yet a widespread issue.
I'm not yet sure about that actually. I suspect a large percentage of people with such workloads aren't lingering lots on the lists.
> The only people that are likely going to see this are going to be on bare
> metal. We should definitely plan on that issue for say 11.
"plan on that issue" - heh. We're talking about major engineering projects here ;)
> I do have a question though, where you have seen this issue is it with
> synchronous_commit on or off?
Both. Whether that matters or not really depends on the workload. If you have bulk writes, it doesn't really matter much.
- Andres
On 04/27/2017 09:34 AM, Andres Freund wrote:
> On 2017-04-27 09:31:34 -0700, Joshua D. Drake wrote:
>> I would agree it isn't yet a widespread issue.
>
> I'm not yet sure about that actually. I suspect a large percentage of
> people with such workloads aren't lingering lots on the lists.
That would probably be true. I was thinking of it more as "most new users are in the cloud", and in the cloud it is going to be rare that a user can hit that level of writes (at least not without spending LOTS of money).
> "plan on that issue" - heh. We're talking about major engineering
> projects here ;)
Sorry, wasn't trying to make light of the effort. :D
> Both. Whether that matters or not really depends on the workload. If you
> have bulk writes, it doesn't really matter much.
Sure, o.k.
Thanks, Andres
On 2017-04-27 10:29:48 -0700, Joshua D. Drake wrote:
> That would probably be true. I was thinking of it more as "most new
> users are in the cloud", and in the cloud it is going to be rare that a
> user can hit that level of writes (at least not without spending LOTS
> of money).
You can get pretty decent NVMe SSD drives on several cloud providers these days, without them immediately bankrupting you. Sure, it's instance storage, but with a decent replication and archival setup, that's not necessarily an issue.
It's not that hard to get to the point where postgres can't keep up with storage, at least for some workloads.
- Andres
Hi,
On 04/25/2017 06:17 AM, Andres Freund wrote:
> I've lately seen more and more installations where the generation of
> write-ahead-log (WAL) is one of the primary bottlenecks. I'm curious
> whether that's primarily a "sampling error" of mine, or whether
> that's indeed more common.
I see those cases too. To some degree it's a sampling bias. People generally don't call us to look at the 99% of systems that perform fine, so we tend to see the 1% of systems under pressure. That doesn't make the observation irrelevant, though. Those demanding systems are one of the things that pushes us forward.
> The primary reason I'm curious is that I'm pondering a few potential
> optimizations, and would like to have some guidance which are more
> and which are less important.
I think any optimization you do will improve at least some of those busy systems.
> Questions (answer as many you can comfortably answer):
> - How many MB/s, segments/s do you see on busier servers?
That depends on the cause (see the next point).
> - What generates the bulk of WAL on your servers (9.5+ can use
> pg_xlogdump --stats to compute that)?
a) systems doing large batches - bulk loads/updates/deletes, one or a few sessions doing a lot - easily high hundreds of MB/s (on a separate device)
b) OLTP systems doing a lot of tiny/small transactions - many concurrent sessions - often end up much more limited by WAL, due to locking etc. - often the trouble is random updates all over the place, causing amplification due to FPIs (a PK on UUID is a great way to cause this unnecessarily, even on apps with a naturally tiny working set)
> - Are you seeing WAL writes being a bottleneck?
On the write-intensive systems, yes. Often the CPUs are waiting for WAL I/O to complete during COMMIT.
> - What kind of backup methods are you using and is the WAL volume a
> problem?
The large and busy systems can easily produce so much WAL that the basebackup is not the largest part of the backup.
That is somewhat solvable by using other means of obtaining the basebackup snapshot (e.g. by taking some sort of filesystem / SAN / ... snapshot). That reduces the amount of WAL needed to make the basebackup consistent, but it doesn't solve the WAL archiving issue.
> - What kind of replication are you using and is the WAL volume a
> problem?
Generally streaming replication, and yes, the amount of WAL may be an issue, partially because the standby is a single-process thing. And as it has to process something generated by N sessions on the primary, that can't end well. Interestingly enough, FPIs can actually make it way faster, because the standby does not need to read the data from disk during recovery.
> - What are your settings for wal_compression, max_wal_size (9.5+) /
> checkpoint_segments (< 9.5), checkpoint_timeout and wal_buffers?
I'd say the usual practice is to tune for timed checkpoints, say 30+ minutes apart (or more). wal_compression is typically 'off' (i.e. the default value).
> - Could you quickly describe your workload?
Pretty much a little bit of everything, depending on the customer.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 04/27/2017 06:34 PM, Andres Freund wrote:
> On 2017-04-27 09:31:34 -0700, Joshua D. Drake wrote:
>> I would agree it isn't yet a widespread issue.
>
> I'm not yet sure about that actually. I suspect a large percentage
> of people with such workloads aren't lingering lots on the lists.
To a certain extent, this is a self-fulfilling prophecy. If you know you'll have such a busy system, you probably do some research and testing first, before choosing the database. If we don't perform well enough, you pick something else. Which removes the data point.
Obviously, there are systems that start small and get busier and busier over time. And those are the ones we see.
cheers
On 04/27/2017 07:35 PM, Andres Freund wrote:
> You can get pretty decent NVMe SSD drives on several cloud
> providers these days, without them immediately bankrupting you. Sure,
> it's instance storage, but with a decent replication and archival
> setup, that's not necessarily an issue.
>
> It's not that hard to get to the point where postgres can't keep up
> with storage, at least for some workloads.
I can confirm this observation. I bought the Intel 750 NVMe SSD last year; the device has 1GB of DDR3 cache on it (power-loss protected) and can do ~1GB/s of sustained O_DIRECT sequential writes. But when running pgbench, I can't push more than ~300MB/s of WAL to it, no matter what I do, because of WALWriteLock.
regards
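The gap Tomas describes can be framed as device utilization (a sketch using the figures from the message above):

```python
# The device sustains ~1 GB/s of sequential writes, but pgbench tops out at
# ~300 MB/s of WAL -- under a third of what the storage can absorb. That is
# consistent with WALWriteLock serialization, not storage, being the limit.
device_mb_s, achieved_mb_s = 1024, 300
utilization = achieved_mb_s / device_mb_s  # ~0.29
```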
On 2017-04-28 01:29:14 +0200, Tomas Vondra wrote:
> I can confirm this observation. I bought the Intel 750 NVMe SSD last year;
> the device has 1GB of DDR3 cache on it (power-loss protected) and can do
> ~1GB/s of sustained O_DIRECT sequential writes. But when running pgbench,
> I can't push more than ~300MB/s of WAL to it, no matter what I do, because
> of WALWriteLock.
Hm, interesting. Even if you up wal_buffers to 128MB, use synchronous_commit = off, and play with wal_writer_delay/flush_after?
- Andres
On 04/28/2017 01:34 AM, Andres Freund wrote:
> Hm, interesting. Even if you up wal_buffers to 128MB, use
> synchronous_commit = off, and play with wal_writer_delay/flush_after?
I think I've tried things like that, but let me do some proper testing. I'll report the numbers in a few days.
regards