Re: understanding postgres issues/bottlenecks - Mailing list pgsql-performance
From | Scott Marlowe |
---|---|
Subject | Re: understanding postgres issues/bottlenecks |
Date | |
Msg-id | dcc563d10901071406j345ce3e5v3399aaf652213ef6@mail.gmail.com Whole thread Raw |
In response to | Re: understanding postgres issues/bottlenecks ("Scott Marlowe" <scott.marlowe@gmail.com>) |
Responses |
Re: understanding postgres issues/bottlenecks
Re: understanding postgres issues/bottlenecks |
List | pgsql-performance |
Just to elaborate on the horror that is a Dell perc5e. We have one in a 1950 with battery backed cache (256 Meg I think). It has an 8 disk 500Gig SATA drive RAID-10 array and 4 1.6GHz cpus and 10 Gigs ram. This server currently serves as a mnogo search server. Here's what vmstat 1 looks like during the day: procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 1 5 38748 60400 43172 9227632 0 0 5676 0 668 703 4 2 12 82 0 5 38748 58888 43176 9229360 0 0 5548 0 672 792 2 0 15 83 0 4 38748 64460 43184 9230476 0 0 5964 72 773 947 1 0 31 67 0 5 38748 61884 43272 9241564 0 0 5896 1112 674 1028 1 2 23 74 0 5 38748 56612 43276 9247376 0 0 5660 0 603 795 0 0 21 79 0 5 38748 56472 43268 9247480 0 0 5700 0 603 790 0 0 22 77 Note 4 or 5 blocking, and reading in data at 5M/sec and bursting small writes. 75 to 80% wait state user and sys time around 1 or 2% rest is idle. This is without adding any load from pgbench. When I add pgbench, the numbers from vmstat look the same pretty much, slight increase of maybe 20% bi and bo and a rise in the blocked processes. Running vacuum on the pgbench db on this machine takes well over a minute. Even a small pgbench test runs slowly, getting 20 to 30 tps during the day. During off hours I can get a max of about 80 tps. ---------------------------------------------------------------------------------------------- Now, here's my primary production server. It is NOT a Dell, or an HP, or an IBM. It is a high end white box machine with a Tyan mobo. Primary difference here is 12 disk SAS RAID-10 and a much faster RAID controller. It's got 8 opteron 2.1GHz cores but honestly, it hardly uses any of them. This is it before I run pgbench. It almost looks like it's sleeping. It is handling about 250 queries per second right now, most serving from memory. procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 0 0 2464 614156 626448 28658476 0 0 0 688 2515 2615 12 1 87 0 0 2 0 2464 638972 626448 28658484 0 0 8 656 2401 2454 16 2 82 0 0 1 0 2464 631852 626448 28658516 0 0 0 552 1939 1984 14 1 85 0 0 4 0 2464 617900 626448 28658532 0 0 32 500 4925 5276 19 2 78 0 0 1 0 2464 617656 626448 28658560 0 0 0 492 3363 3428 14 1 85 0 0 1 0 2464 618648 626448 28658560 0 0 0 752 3391 3579 12 2 85 0 0 It's not reading anything in because it fits in memory (it's got 32 Gig ram) and it's writing out a small amount. I'll run pgbench to give it a heavier load and force more writes. pgbench -c 10 -t 25000 starting vacuum...end. transaction type: TPC-B (sort of) scaling factor: 100 number of clients: 10 number of transactions per client: 25000 number of transactions actually processed: 250000/250000 tps = 2866.549792 (including connections establishing) tps = 2868.010947 (excluding connections establishing) Here's vmstat during that period: procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 1 2 2408 190548 626584 28770448 0 0 0 66200 8210 84391 44 14 35 6 0 0 2 2408 207156 626584 28770900 0 0 0 23832 6688 6426 20 4 58 18 0 7 0 2408 295792 626588 28771904 0 0 0 34372 6831 62873 35 12 43 11 0 3 2 2408 308904 626588 28773128 0 0 0 60040 6711 78235 44 13 40 3 0 4 0 2408 310408 626588 28773660 0 0 0 54780 7779 37399 28 8 50 15 0 6 1 2408 325808 626592 28775912 0 0 16 43588 5345 105348 43 15 39 3 0 10 0 2408 324304 626592 28778188 0 0 16 60984 6582 125105 52 18 29 1 0 8 1 2408 339204 626596 28780248 0 0 8 47956 5203 113799 48 18 33 1 0 4 2 2408 337856 626596 28782840 0 0 0 108096 12132 90391 46 16 33 4 0 Note that wait is generally around 5 to 10% on this machine. Now, I know it's a much more powerful machine, but the difference in performance is not some percentage faster. It's many many factors faster. Twenty or more times as faster as the dell. That dell machine cost us somewhere in the range of $11,000. The MD-1000 was a large part of that cost. The Aberdeen white box that is replacing it cost us about $11,500. I cannot understand how Dell stays in business. It certainly isn't on merit.
pgsql-performance by date: