Thread: dbt2 NOTPM numbers

dbt2 NOTPM numbers

From: Markus Schiltknecht
Hi,

I'm currently playing with dbt2 and am wondering if the results I'm
getting are reasonable. I'm testing a 2x Dual Core Xeon system with 4 GB
of RAM and 8 SATA HDDs attached via an Areca RAID controller with
battery-backed write cache. Seven of the eight platters are configured
as one RAID6, plus one spare drive. That should leave five platters for
distributing read-only accesses.

The NOTPM numbers I'm getting are suspiciously low, IMO, but maybe I'm
expecting too much. What do you think, is this reasonable or do I have
to twiddle with the configuration somewhat more?

Regards

Markus


Here are my results:

                          Response Time (s)
  Transaction      %    Average :    90th %        Total        Rollbacks      %
------------  -----  ---------------------  -----------  ---------------  -----
     Delivery   3.83    549.046 :   595.280         1212                0   0.00
    New Order  45.79    524.659 :   562.016        14494              151   1.05
 Order Status   3.98    517.497 :   551.552         1261                0   0.00
      Payment  42.50    514.562 :   550.383        13452                0   0.00
  Stock Level   3.90    510.610 :   546.957         1236                0   0.00
------------  -----  ---------------------  -----------  ---------------  -----

238.39 new-order transactions per minute (NOTPM)
59.5 minute duration
0 total unknown errors
529 second(s) ramping up

Re: dbt2 NOTPM numbers

From: Heikki Linnakangas
Markus Schiltknecht wrote:
> I'm currently playing with dbt2 and am wondering if the results I'm
> getting are reasonable. I'm testing a 2x Dual Core Xeon system with 4 GB
> of RAM and 8 SATA HDDs attached via an Areca RAID controller with
> battery-backed write cache. Seven of the eight platters are configured
> as one RAID6, plus one spare drive. That should leave five platters for
> distributing read-only accesses.
>
> The NOTPM numbers I'm getting are suspiciously low, IMO, but maybe I'm
> expecting too much. What do you think, is this reasonable or do I have
> to twiddle with the configuration somewhat more?

There's clearly something wrong. The response times are ridiculously
high; they should be < 5 seconds (except for the Stock Level transaction)
to pass a TPC-C test. I wonder if you built any indexes at all?

The configuration I'm running here has 3 data drives, and I'm getting
reasonable results with ~100 warehouses, at ~1200 noTPM.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: dbt2 NOTPM numbers

From: Markus Schiltknecht
Hi,

Heikki Linnakangas wrote:
> There's clearly something wrong. The response times are ridiculously
> high; they should be < 5 seconds (except for the Stock Level transaction)
> to pass a TPC-C test. I wonder if you built any indexes at all?

Hm.. according to the output/5/db/plan0.out, all queries use index
scans, so that doesn't seem to be the problem.

> The configuration I'm running here has 3 data drives, and I'm getting
> reasonable results with ~100 warehouses, at ~1200 noTPM.

Thanks, that's exactly the one simple and very raw comparison value I've
been looking for. (Since most of the results pages of (former?) OSDL are
down).

I'll run a bonnie++ first. As the CPUs seem to be idle most of the time
(see the vmstat.out below), I'm suspecting the RAID or disks.
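
Roughly, something like the following sketch could drive bonnie++ while
logging vmstat alongside it, so throughput and iowait can be correlated
afterwards (the target directory, test size and user below are example
values, not necessarily what I'll end up using):

#!/usr/bin/python
# Sketch: run bonnie++ while capturing vmstat output in parallel.
# Directory, size and user are example values.
import os, signal, subprocess

log = open("vmstat.bonnie.log", "w")
vmstat = subprocess.Popen(["vmstat", "10"], stdout=log)
try:
    # 8 GB test size, i.e. twice the machine's RAM, to defeat caching
    subprocess.call(["bonnie++", "-d", "/opt/dbt2/bb", "-s", "8G", "-u", "postgres"])
finally:
    os.kill(vmstat.pid, signal.SIGTERM)   # stop the vmstat logger
    log.close()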

Regards

Markus


procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  2 329512 1289384      8 2680704    0    0     4     1    1    2  0  0 100  0
 0 48 329512 1733016      8 2083400    0    0  1789  2265  553 1278  1  1  63 35
 2 81 329512  492052      8 3194532    0    0  6007 10135 1025 3291  6  1  21 71
 0  8 329512  153360      8 3457936    0    0  6321 11919 1093 4581  7  2  12 79
 0  9 329512  150188      8 3433380    0    0  2083  5078  707 2197  2  1  35 62
 0  6 329512  148412      8 3408748    0    0  1001  2888  526 1203  1  0  34 64
 0 27 329512  152212      8 3379736    0    0  2281  5166  733 2320  3  1  18 79
 0 11 329512  152560      8 3355940    0    0  1837  4028  626 1738  2  1  35 63
 0 14 329512  149268      8 3334912    0    0  1674  3836  630 1619  2  1  31 67
 0  6 329512  152916      8 3311552    0    0  1404  3017  568 1372  1  0  57 41
 0 13 329688  149492      8 3313200    0    0  1687  4178  650 1644  2  1  29 69
 0 84 329688  153480      8 3309684    0    0   812  3790  641 2669  1  1  22 76
 0 18 329688  149232      8 3314032    0    0    87  2147  511 2414  0  0  16 83
 3 20 329688  149196      8 3314648    0    0   756  1854  496 1044  1  0  52 47

Re: dbt2 NOTPM numbers

From: PFC
> I'll run a bonnie++ first. As the CPUs seem to be idle most of the time
> (see the vmstat.out below), I'm suspecting the RAID or disks.

    You have a huge amount of iowait !
    Did you put the xlog on a separate disk ?
    What filesystem do you use ?
    Did you check that your BBU cache works ?

    For that run a dumb script which does INSERTS in a test table in
autocommit mode ; if you get (7200rpm / 60) = 120 inserts / sec or less,
the good news is that your drives don't lie about fsync, the bad news is
that your BBU cache isn't working...

Re: dbt2 NOTPM numbers

From: Markus Schiltknecht
Hi,

PFC wrote:
>     You have a huge amount of iowait !

Yup.

>     Did you put the xlog on a separate disk ?

No, it's all one big RAID6 for the sake of simplicity (plus I somewhat
doubt that 2 disks for WAL + 5 for data + 1 spare would be much faster
than 7 disks for WAL and data + 1 spare - considering that RAID6 needs
two parity disks, that's 3 vs 5 disks for data...)

>     What filesystem do you use ?

XFS

>     Did you check that your BBU cache works ?

Thanks to your hint, yes. I've attached the small Python script, in
case it might help someone else, too.

>     For that run a dumb script which does INSERTS in a test table in
> autocommit mode ; if you get (7200rpm / 60) = 120 inserts / sec or less,
> the good news is that your drives don't lie about fsync, the bad news is
> that your BBU cache isn't working...

According to my little script, I consistently get around 6000
inserts per second, so I guess either my BBU works, or the drives are
lying ;-)   Simplistic throughput testing with dd gives > 200MB/s, which
also seems fine.


Obviously there's something else I'm doing wrong. I didn't really care
much about postgresql.conf, except setting a larger shared_buffers and a
reasonable effective_cache_size.


Oh, something else that's probably worth thinking about (and just came
to my mind again): the XFS filesystem is on LVM2, on top of that RAID6.


Regards

Markus


Simplistic throughput testing with dd:

dd of=test if=/dev/zero bs=10K count=800000
800000+0 records in
800000+0 records out
8192000000 bytes (8.2 GB) copied, 37.3552 seconds, 219 MB/s
pamonth:/opt/dbt2/bb# dd if=test of=/dev/zero bs=10K count=800000
800000+0 records in
800000+0 records out
8192000000 bytes (8.2 GB) copied, 27.6856 seconds, 296 MB/s

#!/usr/bin/python
#
# Crude BBU / fsync check: INSERT rows one at a time in autocommit mode
# and report the sustained commit rate.

import sys, time
import psycopg

count = 500000

db = psycopg.connect("user=postgres dbname=test")
db.autocommit(True)   # every INSERT is its own transaction, i.e. one commit each

dbc = db.cursor()
dbc.execute("CREATE TABLE test (data TEXT);")

sys.stdout.flush()
start_t = time.time()
for i in range(count):
        dbc.execute("INSERT INTO test VALUES('insert no. %d');" % i)

diff_t = time.time() - start_t
print "%d inserts in %0.3f seconds, %f inserts/sec" % (count, diff_t, count / diff_t)

dbc.execute("DROP TABLE test;")

Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
pamonth          8G 48173  99 214016  33 93972  20 49244  92 266763  32 615.3   0
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 20510  82 +++++ +++ 25655  98 23954  99 +++++ +++ 25441  99

pamonth,8G,48173,99,214016,33,93972,20,49244,92,266763,32,615.3,0,16,20510,82,+++++,+++,25655,98,23954,99,+++++,+++,25441,99


Re: dbt2 NOTPM numbers

From: Heikki Linnakangas
Markus Schiltknecht wrote:
> Hi,
>
> Heikki Linnakangas wrote:
>> There's clearly something wrong. The response times are ridiculously
>> high; they should be < 5 seconds (except for the Stock Level transaction)
>> to pass a TPC-C test. I wonder if you built any indexes at all?
>
> Hm.. according to the output/5/db/plan0.out, all queries use index
> scans, so that doesn't seem to be the problem.

I still suspect there's something wrong with plans, I doubt you can get
that bad performance unless it's doing something really stupid. I'd
suggest setting log_min_duration_statement = 5000, and seeing what you
get. Also check pg_stat_user_tables.seq_scan just to be extra sure
there are no seq scans.
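
Something along these lines would list the seq scan counters per table
(just a sketch using psycopg; the connection string is an assumption,
adjust user/dbname to your setup):

#!/usr/bin/python
# Sketch: list per-table seq scan counters for the dbt2 database.
# Set log_min_duration_statement = 5000 in postgresql.conf separately
# to catch the individual slow statements.
import psycopg

db = psycopg.connect("user=postgres dbname=dbt2")
dbc = db.cursor()
dbc.execute("""
    SELECT relname, seq_scan, seq_tup_read, idx_scan
      FROM pg_stat_user_tables
     ORDER BY seq_scan DESC;
""")
for relname, seq_scan, seq_tup_read, idx_scan in dbc.fetchall():
    print relname, seq_scan, seq_tup_read, idx_scan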

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: dbt2 NOTPM numbers

From: Markus Schiltknecht
Hi,

Heikki Linnakangas wrote:
> I still suspect there's something wrong with plans, I doubt you can get
> that bad performance unless it's doing something really stupid.

Agreed, but I'm still looking for that really stupid thing...  AFAICT,
there are really no seqscans..., see the pg_stat_user_tables below.

> I'd
> suggest setting log_min_duration_statement = 5000, and seeing what you
> get. Also check pg_stat_user_tables.seq_scan just to be extra sure
> there are no seq scans.

I've also added some of the log messages for min_duration_statement
below. Both were taken after two or three test runs.

I'm really wondering if the RAID 6 of the ARECA 1260 hurts so badly.
That would be disappointing, IMO. I'll try if I can reconfigure it to do
RAID 1+0, and then test again. (Unfortunately the box has already been
shipped to the customer, so that's getting tricky to do via ssh..:-( ).


Regards

Markus


*** pg_stat_user_tables ***

 relid | schemaname |  relname   | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | last_vacuum | last_autovacuum               | last_analyze | last_autoanalyze
-------+------------+------------+----------+--------------+----------+---------------+-----------+-----------+-----------+-------------+-------------------------------+--------------+-------------------------------
 16390 | public     | district   |        0 |            0 |   206335 |        206335 |         0 |    100771 |         0 |             | 2007-06-05 15:40:44.39573+02  |              | 2007-06-05 15:39:41.636736+02
 16396 | public     | new_order  |        0 |            0 |    91860 |      41402317 |     51372 |         0 |     45844 |             |                               |              |
 16400 | public     | order_line |        0 |            0 |   101195 |        933197 |    538442 |    436140 |         0 |             |                               |              |
 16402 | public     | item       |        0 |            0 |   538942 |        538442 |         0 |         0 |         0 |             |                               |              |
 16404 | public     | stock      |        0 |            0 |  1093528 |       1077782 |         0 |    538442 |         0 |             |                               |              |
 16394 | public     | history    |        0 |            0 |          |               |     49399 |         0 |         0 |             |                               |              |
 16388 | public     | warehouse  |        0 |            0 |   150170 |        150170 |         0 |     49399 |         0 |             | 2007-06-05 15:39:41.059572+02 |              | 2007-06-05 15:38:39.976122+02
 16398 | public     | orders     |        0 |            0 |    96490 |         96519 |     51372 |     45930 |         0 |             |                               |              |
 16392 | public     | customer   |        0 |            0 |   233263 |        599917 |         0 |     95329 |         0 |             |                               |              |


*** database log snippet ***

2007-06-05 15:42:09 CEST LOG:  duration: 6020.820 ms  statement: SELECT * FROM order_status(1747, 291, 3, '')
2007-06-05 15:42:09 CEST LOG:  duration: 688.730 ms  statement: SELECT payment(47, 2, 1533, 47, 2, '', 4295.460000)
2007-06-05 15:42:09 CEST LOG:  duration: 5923.518 ms  statement: SELECT payment(319, 8, 0, 319, 8, 'OUGHTATIONEING', 2331.470000)
2007-06-05 15:42:09 CEST LOG:  duration: 6370.433 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:09 CEST LOG:  duration: 6463.583 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:09 CEST LOG:  duration: 6358.047 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:09 CEST LOG:  duration: 6114.972 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:09 CEST LOG:  duration: 6193.684 ms  statement: SELECT payment(96, 10, 0, 96, 10, 'ESEOUGHTBAR', 997.050000)
2007-06-05 15:42:09 CEST LOG:  duration: 6375.163 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:09 CEST LOG:  duration: 6139.613 ms  statement: SELECT payment(454, 8, 0, 454, 8, 'ANTIOUGHTEING', 1575.110000)
2007-06-05 15:42:09 CEST LOG:  duration: 6336.462 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:09 CEST LOG:  duration: 6420.227 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:09 CEST LOG:  duration: 6447.025 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:09 CEST LOG:  duration: 15549.277 ms  statement: SELECT delivery(124, 7)
2007-06-05 15:42:09 CEST LOG:  duration: 1432.199 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:09 CEST LOG:  duration: 6478.086 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:09 CEST LOG:  duration: 1405.925 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:09 CEST LOG:  duration: 8399.567 ms  statement: SELECT delivery(374, 4)
2007-06-05 15:42:10 CEST LOG:  duration: 657.939 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:10 CEST LOG:  duration: 1159.131 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:10 CEST LOG:  duration: 840.907 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:10 CEST LOG:  duration: 616.234 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:10 CEST LOG:  duration: 1115.098 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:10 CEST LOG:  duration: 1332.445 ms  statement: SELECT payment(465, 6, 0, 465, 6, 'ABLEESEESE', 4660.790000)
2007-06-05 15:42:10 CEST LOG:  duration: 855.661 ms  statement: SELECT payment(267, 6, 0, 267, 6, 'OUGHTEINGOUGHT', 4214.080000)
2007-06-05 15:42:10 CEST LOG:  duration: 580.983 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:10 CEST LOG:  duration: 883.528 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:10 CEST LOG:  duration: 7757.581 ms  statement: SELECT delivery(126, 6)
2007-06-05 15:42:10 CEST LOG:  duration: 537.642 ms  statement: SELECT payment(493, 2, 0, 493, 2, 'BARBARANTI', 2881.500000)
2007-06-05 15:42:10 CEST LOG:  duration: 1035.529 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:10 CEST LOG:  duration: 1007.521 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:11 CEST LOG:  duration: 1088.356 ms  statement: FETCH ALL IN mycursor
2007-06-05 15:42:11 CEST LOG:  duration: 1749.507 ms  statement: SELECT delivery(442, 5)



Re: dbt2 NOTPM numbers

From: Heikki Linnakangas
Markus Schiltknecht wrote:
> Hi,
>
> Heikki Linnakangas wrote:
>> I still suspect there's something wrong with plans, I doubt you can
>> get that bad performance unless it's doing something really stupid.
>
> Agreed, but I'm still looking for that really stupid thing...  AFAICT,
> there are really no seqscans..., see the pg_stat_user_tables below.
>
>> I'd suggest setting log_min_duration_statement = 5000, and seeing what
>> you get. Also check pg_stat_user_tables.seq_scan just to be extra sure
>> there are no seq scans.
>
> I've also added some of the log messages for min_duration_statement
> below. Both were taken after two or three test runs.
>
> I'm really wondering if the RAID 6 of the ARECA 1260 hurts so badly.
> That would be disappointing, IMO. I'll try if I can reconfigure it to do
> RAID 1+0, and then test again. (Unfortunately the box has already been
> shipped to the customer, so that's getting tricky to do via ssh..:-( ).

Maybe, TPC-C is very write-intensive. I don't know much about RAID
stuff, but I think you'd really benefit from a separate WAL drive. You
could try turning fsync=off to see if that makes a difference.

Oh, and how many connections are you using? DBT-2 can be quite sensitive
to that. 30 seems to work pretty well for me.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: dbt2 NOTPM numbers

From: Markus Schiltknecht
Hi,

Heikki Linnakangas wrote:
> Maybe, TPC-C is very write-intensive. I don't know much about RAID
> stuff, but I think you'd really benefit from a separate WAL drive. You
> could try turning fsync=off to see if that makes a difference.

Hm.. good idea, I'll try that.

> Oh, and how many connections are you using? DBT-2 can be quite sensitive
> to that. 30 seems to work pretty well for me.

I've been using between 2 and 90, but that made pretty much no
difference at all. I'm not getting anything more than some 300 NOTPM.

Regards

Markus

Re: dbt2 NOTPM numbers

From: Greg Smith
On Tue, 5 Jun 2007, Markus Schiltknecht wrote:

> I'm really wondering if the RAID 6 of the ARECA 1260 hurts so badly

All of your disk performance tests look reasonable; certainly not slow
enough to cause the issue you're seeing.  The only thing I've seen in this
thread that makes me slightly suspicious is:

> Oh, something else that's probably worth thinking about (and just came
> to my mind again): the XFS filesystem is on LVM2, on top of that RAID6.

There have been reports here from reliable sources that lvm can introduce
performance issues;
http://archives.postgresql.org/pgsql-performance/2006-07/msg00276.php is
one example.  Your dd test results suggest you're getting decent throughput
though, so you don't seem to be suffering too much from that.

Anyway, did you post your postgresql.conf yet?  I don't remember seeing
it.  From looking at your latest data, my first guess would be there's
something wrong there that's either causing a) the system to be
checkpointing constantly, or b) not enough memory allocated to allow
queries to execute properly.  The fact that all your long statement times
come from SELECTs suggests to me that playing with the WAL parameters like
fsync isn't likely to help here.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: dbt2 NOTPM numbers

From: Mark Wong
On 6/4/07, Markus Schiltknecht <markus@bluegap.ch> wrote:
> Thanks, that's exactly the one simple and very raw comparison value I've
> been looking for. (Since most of the results pages of (former?) OSDL are
> down).

Yeah, those results pages are gone for good. :(

Regards,
Mark

Re: dbt2 NOTPM numbers

From: Jim Nasby
On Jun 4, 2007, at 1:56 PM, Markus Schiltknecht wrote:
> Simplistic throughput testing with dd:
>
> dd of=test if=/dev/zero bs=10K count=800000
> 800000+0 records in
> 800000+0 records out
> 8192000000 bytes (8.2 GB) copied, 37.3552 seconds, 219 MB/s
> pamonth:/opt/dbt2/bb# dd if=test of=/dev/zero bs=10K count=800000
> 800000+0 records in
> 800000+0 records out
> 8192000000 bytes (8.2 GB) copied, 27.6856 seconds, 296 MB/s

I don't think that kind of testing is useful for good raid
controllers on RAID5/6, because the controller will just be streaming
the data out; it'll compute the parity blocks on the fly and just
stream data to the drives as fast as possible.

But that's not how writes in the database work (except for WAL);
you're writing stuff all over the place, none of which is streamed.
So in the best case (the entire stripe being updated is in the
controller's cache), at a minimum it's going to have to write data +
parity ( * 2 for RAID 6, IIRC) for every write. But any real-sized
database is going to be far larger than your raid cache, which means
there's a good chance a block being written will no longer have its
stripe in cache. In that case, the controller is going to have to
read a bunch of data back off the drive, which is going to clobber
performance.

Now, add that performance bottleneck on top of your WAL writes and
you're in real trouble.

BTW, I was thinking in terms of stripe size when I wrote this, but I
don't know if good controllers actually need to deal with things at a
stripe level, or if they can deal with smaller chunks of a stripe. In
either case, the issue is still the number of extra reads going on.
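
To put rough numbers on that, here's a back-of-the-envelope sketch (the
per-disk IOPS figure and the worst-case read-modify-write assumption are
just illustrative, not measured values):

#!/usr/bin/python
# Back-of-the-envelope random-write estimate for the arrays discussed here.
# Assumes ~75 random IOs/sec per 7200 rpm SATA drive and a worst case where
# every small write needs a full read-modify-write of data plus parity.

disk_iops = 75.0   # assumed per-drive random IOPS

def raid10_write_iops(disks):
    # each logical write costs 2 physical writes (one per mirror side)
    return disks * disk_iops / 2

def raid6_write_iops(disks):
    # worst case: read old data + P + Q, write new data + P + Q = 6 IOs
    return disks * disk_iops / 6

print "RAID 1+0, 8 disks: ~%d random writes/sec" % raid10_write_iops(8)
print "RAID 6,   7 disks: ~%d random writes/sec" % raid6_write_iops(7)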
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)



Re: dbt2 NOTPM numbers

From: Markus Schiltknecht
Hi,

Jim Nasby wrote:
> I don't think that kind of testing is useful for good raid controllers
> on RAID5/6, because the controller will just be streaming the data out;
> it'll compute the parity blocks on the fly and just stream data to the
> drives as fast as possible.

That's why I called it 'simplistic throughput testing'...

> But that's not how writes in the database work (except for WAL); you're
> writing stuff all over the place, none of which is streamed. So in the
> best case (the entire stripe being updated is in the controller's
> cache), at a minimum it's going to have to write data + parity ( * 2 for
> RAID 6, IIRC) for every write. But any real-sized database is going to
> be far larger than your raid cache, which means there's a good chance a
> block being written will no longer have it's stripe in cache. In that
> case, the controller is going to have to read a bunch of data back off
> the drive, which is going to clobber performance.

I'm well aware. Our workload (hopefully) has a much lower write/read
ratio than dbt2, so RAID 6 might work anyway.

> Now, add that performance bottleneck on top of your WAL writes and
> you're in real trouble.

Well, I'm basically surprised by the low NOTPM numbers compared to my
desktop system, which also does around 200 NOTPMs, with only two
platters in a RAID 1 config... How can a server with four cores and 8
platters be equally slow?

Anyway, I've now reconfigured the system with RAID 1+0 and got more than
twice the NOTPMs:

                          Response Time (s)
  Transaction      %    Average :    90th %        Total        Rollbacks      %
------------  -----  ---------------------  -----------  ---------------  -----
     Delivery   3.84    204.733 :   241.998          704                0   0.00
    New Order  45.77    203.651 :   242.847         8382               75   0.90
 Order Status   4.32    199.184 :   238.081          792                0   0.00
      Payment  42.02    198.969 :   236.549         7695                0   0.00
  Stock Level   4.04    198.668 :   236.113          740                0   0.00
------------  -----  ---------------------  -----------  ---------------  -----

567.72 new-order transactions per minute (NOTPM)
14.5 minute duration
0 total unknown errors
529 second(s) ramping up

I still feel that 550 is pretty low. The response times are beyond
good and evil.

As vmstat.out tells us, the CPUs are still pretty much idle or waiting
most of the time.

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  3 494560 2181964      8 1787680   13   15   317   242  140    2  4  1 72 23
 0  9 494560 1558892      8 2298348    0    0  2973  2018  584 1114  2  1 76 21
 1 14 494496  424116      8 3316000    2    0  5613  9293  935 2943  5  1 29 65
 0 15 452840  150148      8 3487160  738    3  5662  8709  925 3444  5  2 21 73
 0 11 439172  151052      8 3386556  263    0  5690  8293  969 4145  5  2 23 70
 0 17 438996  149748      8 3308184   57    6  5036  7174  902 4104  5  2 25 69
 1 25 439940  150344      8 3228304    9   28  4757  7479  922 4269  5  2 26 67

For everybody interested, these settings differ from the Pg 8.2
default postgresql.conf:

listen_addresses = '*'
port = 54321
shared_buffers = 2048MB
work_mem = 10MB
maintenance_work_mem = 64MB
#max_stack_depth = 4MB
max_fsm_pages = 409600
checkpoint_segments = 6
checkpoint_timeout = 1h
effective_cache_size = 3800MB
log_min_duration_statement = 500


For dbt2, I've used 500 warehouses and 90 concurrent connections,
default values for everything else.

Do I simply have to put more RAM (currently 4GB) in that machine? Or
what else can be wrong?

Is anybody else seeing low performance with the Areca SATA Controllers?
(in my case: "Areca Technology Corp. ARC-1260 16-Port PCI-Express to
SATA RAID Controller", according to lspci)


Then again, maybe I'm just expecting too much...


Regards

Markus


Re: dbt2 NOTPM numbers

From: Heikki Linnakangas
Markus Schiltknecht wrote:
> For dbt2, I've used 500 warehouses and 90 concurrent connections,
> default values for everything else.

500? That's just too much for the hardware. Start from say 70 warehouses
and up it from there 10 at a time until you hit the wall. I'm using 30
connections with ~100 warehouses on somewhat similar hardware.
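
Something like this little driver loop is what I have in mind, i.e.
rerun the test while stepping up the warehouse count ("run_workload.sh"
and its flags are just placeholders for whatever wrapper you use to kick
off a dbt2 run, not the real dbt2 command line):

#!/usr/bin/python
# Sketch: sweep the warehouse count to find where NOTPM stops scaling.
# The run script name and its flags below are placeholders.
import subprocess

connections = 30
for warehouses in range(70, 200, 10):
    outdir = "output/w%d" % warehouses
    rc = subprocess.call(["./run_workload.sh",
                          "-w", str(warehouses),
                          "-c", str(connections),
                          "-o", outdir])
    if rc != 0:
        print "run with %d warehouses failed, stopping" % warehouses
        break
    print "%d warehouses done, results in %s" % (warehouses, outdir)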

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: dbt2 NOTPM numbers

From: Markus Schiltknecht
Heikki Linnakangas wrote:
> Markus Schiltknecht wrote:
>> For dbt2, I've used 500 warehouses and 90 concurrent connections,
>> default values for everything else.
>
> 500? That's just too much for the hardware. Start from say 70 warehouses
> and up it from there 10 at a time until you hit the wall. I'm using 30
> connections with ~100 warehouses on somewhat similar hardware.

Aha! That's why...  I've seen the '500' in some dbt2 samples and thought
it'd be a good default value.

But it makes sense that the benchmark doesn't automatically 'scale down'...

Stupid me.

Thanks again! Hoping for larger NOTPMs.

Regards

Markus



Re: dbt2 NOTPM numbers

From: Mark Wong
On 6/11/07, Markus Schiltknecht <markus@bluegap.ch> wrote:
> Heikki Linnakangas wrote:
> > Markus Schiltknecht wrote:
> >> For dbt2, I've used 500 warehouses and 90 concurrent connections,
> >> default values for everything else.
> >
> > 500? That's just too much for the hardware. Start from say 70 warehouses
> > and up it from there 10 at a time until you hit the wall. I'm using 30
> > connections with ~100 warehouses on somewhat similar hardware.
>
> Aha! That's why...  I've seen the '500' in some dbt2 samples and thought
> it'd be a good default value.
>
> But it makes sense that the benchmark doesn't automatically 'scale down'...
>
> Stupid me.
>
> Thanks again! Hoping for larger NOTPMs.

Yeah, I ran with 500+ warehouses, but I had 6 14-disk arrays of 15K
RPM scsi drives and 6 dual-channel controllers... :)

Regards,
Mark

Re: dbt2 NOTPM numbers

From: Markus Schiltknecht
Hi,

Mark Wong wrote:
> Yeah, I ran with 500+ warehouses, but I had 6 14-disk arrays of 15K
> RPM scsi drives and 6 dual-channel controllers... :)

Lucky you!

In the mean time, I've figured out that the box in question peaked at
about 1450 NOTPMs with 120 warehouses with RAID 1+0. I'll try to compare
again to RAID 6.

Is there any place where such results are collected?

Regards

Markus

Re: dbt2 NOTPM numbers

From: Mark Wong
On 6/13/07, Markus Schiltknecht <markus@bluegap.ch> wrote:
> Hi,
>
> Mark Wong wrote:
> > Yeah, I ran with 500+ warehouses, but I had 6 14-disk arrays of 15K
> > RPM scsi drives and 6 dual-channel controllers... :)
>
> Lucky you!
>
> In the mean time, I've figured out that the box in question peaked at
> about 1450 NOTPMs with 120 warehouses with RAID 1+0. I'll try to compare
> again to RAID 6.
>
> Is there any place where such results are collected?

Unfortunately not anymore.  When I was working at OSDL there was...
I've been told that the lab has been mostly disassembled now, so the
data is lost.

Regards,
Mark

Re: dbt2 NOTPM numbers

From: Jim Nasby
On Jun 13, 2007, at 11:43 AM, Markus Schiltknecht wrote:
> In the mean time, I've figured out that the box in question peaked
> at about 1450 NOTPMs with 120 warehouses with RAID 1+0. I'll try to
> compare again to RAID 6.
>
> Is there any place where such results are collected?

There is the ill-used -benchmarks list, but perhaps it would be
better if we set up a wiki for this...
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)