Re: Testing Sandforce SSD - Mailing list pgsql-performance

From Yeb Havinga
Subject Re: Testing Sandforce SSD
Date
Msg-id 4C4C83DF.3010703@gmail.com
Whole thread Raw
In response to Re: Testing Sandforce SSD  (Greg Smith <greg@2ndquadrant.com>)
Responses Re: Testing Sandforce SSD  (Yeb Havinga <yebhavinga@gmail.com>)
Re: Testing Sandforce SSD  (Yeb Havinga <yebhavinga@gmail.com>)
List pgsql-performance
Greg Smith wrote:
> Put it on ext3, toggle on noatime, and move on to testing.  The
> overhead of the metadata writes is the least of the problems when
> doing write-heavy stuff on Linux.
I ran a pgbench run and power failure test during pgbench with a 3 year
old computer

8GB DDR ?
Intel Core 2 duo 6600 @ 2.40GHz
Intel Corporation 82801IB (ICH9) 2 port SATA IDE Controller
64 bit 2.6.31-22-server (Ubuntu karmic), kernel option elevator=deadline
sysctl options besides increasing shm:
fs.file-max=327679
fs.aio-max-nr=3145728
vm.swappiness=0
vm.dirty_background_ratio = 3
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_ratio = 15

Filesystem on SSD with postgresql data: ext3 mounted with
noatime,nodiratime,relatime
Postgresql cluster: did initdb with C locale. Data and pg_xlog together
on the same ext3 filesystem.

Changed in postgresql.conf: settings with pgtune for OLTP and 15 connections
maintenance_work_mem = 480MB # pgtune wizard 2010-07-25
checkpoint_completion_target = 0.9 # pgtune wizard 2010-07-25
effective_cache_size = 5632MB # pgtune wizard 2010-07-25
work_mem = 512MB # pgtune wizard 2010-07-25
wal_buffers = 8MB # pgtune wizard 2010-07-25
checkpoint_segments = 31 # pgtune said 16 here
shared_buffers = 1920MB # pgtune wizard 2010-07-25
max_connections = 15 # pgtune wizard 2010-07-25

Initialized with scale 800 with is about 12GB. I especially went beyond
an in RAM size for this machine (that would be ~ 5GB), so random reads
would weigh in the result. Then let pgbench run the tcp benchmark with
-M prepared, 10 clients and -T 3600 (one hour) and 10 clients, after
that loaded the logfile in a db and did some queries. Then realized the
pgbench result page was not in screen buffer anymore so I cannot copy it
here, but hey, those can be edited as well right ;-)

select count(*),count(*)/3600,avg(time),stddev(time) from log;
  count  | ?column? |          avg          |     stddev
---------+----------+-----------------------+----------------
 4939212 |     1372 | 7282.8581978258880161 | 11253.96967962
(1 row)

Judging from the latencys in the logfiles I did not experience serious
lagging (time is in microseconds):

select * from log order by time desc limit 3;
 client_id | tx_no |  time   | file_no |   epoch    | time_us
-----------+-------+---------+---------+------------+---------
         3 | 33100 | 1229503 |       0 | 1280060345 |  866650
         9 | 39990 | 1077519 |       0 | 1280060345 |  858702
         2 | 55323 | 1071060 |       0 | 1280060519 |  750861
(3 rows)

select * from log order by time desc limit 3 OFFSET 1000;
 client_id | tx_no  |  time  | file_no |   epoch    | time_us
-----------+--------+--------+---------+------------+---------
         5 | 262466 | 245953 |       0 | 1280062074 |  513789
         1 | 267519 | 245867 |       0 | 1280062074 |  513301
         7 | 273662 | 245532 |       0 | 1280062078 |  378932
(3 rows)

select * from log order by time desc limit 3 OFFSET 10000;
 client_id | tx_no  | time  | file_no |   epoch    | time_us
-----------+--------+-------+---------+------------+---------
         5 | 123011 | 82854 |       0 | 1280061036 |  743986
         6 | 348967 | 82853 |       0 | 1280062687 |  776317
         8 | 439789 | 82848 |       0 | 1280063109 |  552928
(3 rows)

Then I started pgbench again with the same setting, let it run for a few
minutes and in another console did CHECKPOINT and then turned off power.
After restarting, the database recovered without a problem.

LOG:  database system was interrupted; last known up at 2010-07-25
10:14:15 EDT
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  redo starts at F/98008610
LOG:  record with zero length at F/A2BAC040
LOG:  redo done at F/A2BAC010
LOG:  last completed transaction was at log time 2010-07-25
10:14:16.151037-04

regards,
Yeb Havinga

pgsql-performance by date:

Previous
From: Dimitri Fontaine
Date:
Subject: Re: Pooling in Core WAS: Need help in performance tuning.
Next
From: Yeb Havinga
Date:
Subject: Re: Testing Sandforce SSD