Postgres, fsync and RAID controller with 100M of internal cache & dedicated battery - Mailing list pgsql-general

From Dmitry Koterov
Subject Postgres, fsync and RAID controller with 100M of internal cache & dedicated battery
Date
Msg-id d7df81620708220828y123c03n48ca23e3457f5d2@mail.gmail.com
Responses Re: Postgres, fsync and RAID controller with 100M of internal cache & dedicated battery  ("Scott Marlowe" <scott.marlowe@gmail.com>)
Re: Postgres, fsync and RAID controller with 100M of internal cache & dedicated battery  (Greg Smith <gsmith@gregsmith.com>)
Re: Postgres, fsync and RAID controller with 100M of internal cache & dedicated battery  (Greg Smith <gsmith@gregsmith.com>)
Re: Postgres, fsync and RAID controller with 100M of internal cache & dedicated battery  (Lincoln Yeoh <lyeoh@pop.jaring.my>)
List pgsql-general
Hello.

We are trying to use an HP CISS controller (Smart Array E200i) with internal cache memory (100M for write caching, with a built-in battery) together with Postgres. Typically, under heavy load, the fsync phase of a Postgres checkpoint runs very slowly:

checkpoint buffers dirty=16.8 MB (3.3%) write=24.3 ms sync=6243.3 ms

(If we turn off fsync (fsync=0), the speed increases greatly.) Unfortunately, the slow sync degrades the performance of the whole database for the duration of the checkpoint.
Here are the timings (in milliseconds) of a test transaction called repeatedly from 6 concurrent threads with fsync turned ON:

40.4
44.4
37.4
44.0
42.7
41.8
218.1
254.2
101.0
42.2
42.4
41.0
39.5

(Note the significant slowdown during the checkpoint.)
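
To put a number on that slowdown, here is a minimal Python sketch that compares each timing above with the median. The 2x-median cutoff and the variable names are assumptions chosen for illustration only, not part of the original test.

```python
import statistics

# Transaction timings in milliseconds, copied from the list above.
timings_ms = [40.4, 44.4, 37.4, 44.0, 42.7, 41.8, 218.1, 254.2,
              101.0, 42.2, 42.4, 41.0, 39.5]

# Assumption: a sample counts as "slow" if it exceeds twice the median.
SLOW_FACTOR = 2.0

median_ms = statistics.median(timings_ms)
slow = [t for t in timings_ms if t > SLOW_FACTOR * median_ms]

print(f"median: {median_ms} ms")
for t in slow:
    # Report how many times slower than the median each outlier is.
    print(f"slow sample: {t} ms ({t / median_ms:.1f}x the median)")
```

The median is used instead of the mean so that the checkpoint-affected samples do not distort the baseline they are compared against.
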
Here is the dstat disk write activity log for that test:

   0
   0
 284k
   0
   0
  84k
   0
   0
 276k
  37M
  208k
   0
   0
   0
   0
 156k
   0
   0
   0
   0
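
The log above shows that the writes are bursty rather than steady. A small Python sketch makes that concrete by converting the samples to bytes and finding the peak. It assumes dstat's `k` and `M` suffixes are 1024-based; the helper name `to_bytes` is made up for this example.

```python
# Assumption: dstat suffixes are powers of 1024.
UNITS = {"B": 1, "k": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(cell):
    """Convert a dstat cell such as '284k', '37M' or '0' to bytes."""
    cell = cell.strip()
    if cell[-1] in UNITS:
        return int(float(cell[:-1]) * UNITS[cell[-1]])
    return int(float(cell))


# Disk write samples copied from the dstat log above, one per interval.
samples = ["0", "0", "284k", "0", "0", "84k", "0", "0", "276k", "37M",
           "208k", "0", "0", "0", "0", "156k", "0", "0", "0", "0"]

writes = [to_bytes(s) for s in samples]
total = sum(writes)
peak = max(writes)

print(f"peak interval: #{writes.index(peak)} with {peak} bytes")
print(f"share of all bytes written in the peak interval: {peak / total:.1%}")
```
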

I have written a small Perl script to check how slow fsync is on the Smart Array E200i controller. Theoretically, because of the write cache, fsync MUST cost nothing, but in practice that is not true:

# cd /mnt/c0d1p1/
# perl -e 'use Time::HiRes qw(gettimeofday tv_interval); system "sync"; open F, ">bulk"; print F ("a" x (1024 * 1024 * 20)); close F; $t0=[gettimeofday]; system "sync"; print ">>> fsync took " . tv_interval ( $t0, [gettimeofday]) . " s\n"; unlink "bulk"'
>>> fsync took 0.247033 s

As you can see, syncing the 20M block written above (1024 * 1024 * 20 bytes) took 0.25 s.
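
Note that the one-liner above runs the `sync` command, which flushes every mounted filesystem, rather than calling fsync on the test file itself. As a hedged alternative, the following Python sketch times `os.fsync` on a single file descriptor. The function name `time_fsync` and its default size are assumptions for illustration; it is a rough measurement aid, not a benchmark of the controller.

```python
import os
import tempfile
import time


def time_fsync(directory, size_bytes=20 * 1024 * 1024):
    """Write size_bytes to a temporary file in `directory` and
    return the number of seconds spent inside os.fsync()."""
    fd, path = tempfile.mkstemp(dir=directory, prefix="bulk")
    try:
        payload = memoryview(b"a" * size_bytes)
        written = 0
        # os.write may write fewer bytes than requested, so loop.
        while written < size_bytes:
            written += os.write(fd, payload[written:])
        start = time.perf_counter()
        os.fsync(fd)  # flush only this file, not the whole system
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return elapsed


# Example (the path is the mount point used in the session above):
#   print(">>> fsync took %.6f s" % time_fsync("/mnt/c0d1p1"))
```

Running it several times, and with different sizes, should give a clearer picture than a single sample, since one slow run may simply coincide with other disk activity.
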

The question is: how can we solve this problem and make fsync run with no delay? It seems to me that the controller's internal write cache is not being used (which is strange, because all the configuration options look fine), but how can I check that? Or is there perhaps some other side effect at work?
