Re: H800 + md1200 Performance problem - Mailing list pgsql-performance

From Cesar Martin
Subject Re: H800 + md1200 Performance problem
Msg-id CAMAsR=7Xkzcr-G_NO-Jn2yTa+BSayuMMK7Gv7diXDNcC3WNVUw@mail.gmail.com
In response to Re: H800 + md1200 Performance problem  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-performance
A RAID controller issue or driver problem was the first thing I investigated.
I installed CentOS 5.4 at the beginning, but I had performance problems, so I contacted Dell support. However, CentOS is not supported by Dell, so I installed Red Hat 6 and we contacted Dell again with the same problem.
Dell says that everything is fine and that this is a software problem.
I have installed CentOS 5.4, CentOS 6.2 and Red Hat 6 with similar results, so I do not think it is a driver problem (megasas-raid kernel module).
I will check kernel updates...
Thanks!

PS. Lately I'm pretty disappointed with the quality of Dell components; this is not the first problem we have had with hardware in new machines.

On 4 April 2012 at 19:16, Merlin Moncure <mmoncure@gmail.com> wrote:
On Wed, Apr 4, 2012 at 4:42 AM, Cesar Martin <cmartinp@gmail.com> wrote:
> Hello,
>
> Yesterday I changed the kernel setting that Scott suggested,
> vm.zone_reclaim_mode = 0. I have run new benchmarks and I have
> noticed changes, at least in Postgres:
>
> First exec:
> EXPLAIN ANALYZE SELECT * from company_news_internet_201111;
>                                                                  QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on company_news_internet_201111  (cost=0.00..369577.79
> rows=6765779 width=323) (actual time=0.020..7984.707 rows=6765779 loops=1)
>  Total runtime: 12699.008 ms
> (2 rows)
>
> Second:
> EXPLAIN ANALYZE SELECT * from company_news_internet_201111;
>                                                                  QUERY PLAN
>
> --------------------------------------------------------------------------------------------------------------------------------------------
>  Seq Scan on company_news_internet_201111  (cost=0.00..369577.79
> rows=6765779 width=323) (actual time=0.023..1767.440 rows=6765779 loops=1)
>  Total runtime: 2696.901 ms
>
> It seems that the data is now being cached properly...
>
> The large query takes 80 seconds on the first run and around 23 seconds
> on the second. This is not spectacular, but it is better than yesterday.
>
> Furthermore, the results of dd are strange:
>
> dd if=/dev/zero of=/vol02/bonnie/DD bs=8M count=16384
> 16384+0 records in
> 16384+0 records out
> 137438953472 bytes (137 GB) copied, 803,738 s, 171 MB/s
>
> I think 171 MB/s is a poor value for a 12-disk SAS RAID 10... And when I
> run iostat during the dd execution, I get results like:
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdc            1514,62         0,01       108,58         11     117765
> sdc            3705,50         0,01       316,62          0        633
> sdc               2,00         0,00         0,05          0          0
> sdc             920,00         0,00        63,49          0        126
> sdc            8322,50         0,03       712,00          0       1424
> sdc            6662,50         0,02       568,53          0       1137
> sdc               0,00         0,00         0,00          0          0
> sdc               1,50         0,00         0,04          0          0
> sdc            6413,00         0,01       412,28          0        824
> sdc           13107,50         0,03       867,94          0       1735
> sdc               0,00         0,00         0,00          0          0
> sdc               1,50         0,00         0,03          0          0
> sdc            9719,00         0,03       815,49          0       1630
> sdc            2817,50         0,01       272,51          0        545
> sdc               1,50         0,00         0,05          0          0
> sdc            1181,00         0,00        71,49          0        142
> sdc            7225,00         0,01       362,56          0        725
> sdc            2973,50         0,01       269,97          0        539
>
> I don't understand why MB_wrtn/s keeps swinging between 0 and nearly
> 800 MB/s during execution.
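
The dd and iostat figures quoted above can be checked with a little arithmetic. The following is only a sketch: it assumes the third numeric column of the iostat lines is MB_wrtn/s (as the message implies), that the first iostat line is the since-boot summary iostat prints before the interval reports (so it is excluded), and it uses a hypothetical 1 MB/s cutoff, `IDLE_THRESHOLD`, to decide when the device counts as idle.

```python
# dd figures from the quoted run: bs=8M, count=16384, 803.738 s elapsed.
total_bytes = 8 * 1024 * 1024 * 16384
elapsed_s = 803.738
dd_mb_per_s = total_bytes / elapsed_s / 1e6  # dd reports decimal megabytes

# MB_wrtn/s values copied from the quoted iostat lines, with decimal commas
# converted to points. ASSUMPTION: the first quoted line (108.58) is the
# since-boot average, so it is left out of the interval samples.
samples = [316.62, 0.05, 63.49, 712.00, 568.53, 0.00, 0.04, 412.28, 867.94,
           0.00, 0.03, 815.49, 272.51, 0.05, 71.49, 362.56, 269.97]

IDLE_THRESHOLD = 1.0  # MB/s; hypothetical cutoff for "nothing being written"

idle = [s for s in samples if s < IDLE_THRESHOLD]
mean_rate = sum(samples) / len(samples)
peak_rate = max(samples)

print(f"dd average over the whole run: {dd_mb_per_s:.1f} MB/s")
print(f"iostat intervals: {len(samples)}, idle: {len(idle)}")
print(f"iostat mean {mean_rate:.1f} MB/s, peak {peak_rate:.1f} MB/s")
```

The excerpt covers only a short window of a 13-minute run, so its mean is not directly comparable to the dd average. What it does show is the pattern described in the message: a sizeable share of the intervals write almost nothing, while others burst to several hundred MB/s.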

This is looking more and more like a RAID controller issue. ISTM
it's bucking the cache, filling it up and flushing it synchronously.
Your read results are OK but not what they should be, IMO.  Maybe it's
an environmental issue, or the card is just a straight-up lemon (no
surprise in the Dell line).  Are you using standard drivers, and have
you checked for updates?  Have you considered contacting Dell support?

merlin
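
For a rough sense of the read rates behind the two EXPLAIN ANALYZE runs quoted above, the table size can be estimated from the plan cost. This is a sketch under assumptions the thread does not state: stock planner settings (seq_page_cost = 1.0, cpu_tuple_cost = 0.01) and the default 8 kB block size. With those, the cost of an unfiltered sequential scan is pages * seq_page_cost + rows * cpu_tuple_cost, which can be solved for the page count.

```python
# Figures from the quoted EXPLAIN ANALYZE output.
total_cost = 369577.79
rows = 6765779
first_scan_ms = 7984.707   # Seq Scan actual time, first execution
second_scan_ms = 1767.440  # Seq Scan actual time, second execution

# ASSUMPTIONS (not stated in the thread): default planner cost settings
# and the default PostgreSQL block size.
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
BLOCK_BYTES = 8192

# Solve cost = pages * seq_page_cost + rows * cpu_tuple_cost for pages.
pages = (total_cost - rows * CPU_TUPLE_COST) / SEQ_PAGE_COST
table_bytes = pages * BLOCK_BYTES

def scan_rate_mb_s(elapsed_ms):
    """Effective scan rate in decimal MB/s for one pass over the table."""
    return table_bytes / 1e6 / (elapsed_ms / 1000.0)

first_mb_s = scan_rate_mb_s(first_scan_ms)
second_mb_s = scan_rate_mb_s(second_scan_ms)

print(f"estimated table size: {table_bytes / 1e9:.2f} GB ({round(pages)} pages)")
print(f"first run:  {first_mb_s:.0f} MB/s")
print(f"second run: {second_mb_s:.0f} MB/s")
```

The scan times include CPU work as well as I/O, so these are effective rates rather than raw device throughput, and they hold only if the assumed settings match the server.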



--
César Martín Pérez
cmartinp@gmail.com
