Re: Benchmark Data requested - Mailing list pgsql-performance
| From | Jignesh K. Shah |
|---|---|
| Subject | Re: Benchmark Data requested |
| Date | |
| Msg-id | 47A7B292.2040305@sun.com |
| In response to | Re: Benchmark Data requested (Gregory Stark <stark@enterprisedb.com>) |
| List | pgsql-performance |
TPC-H has two runs: the Power Run, which is a single stream (Q1-Q22 plus RF1 and RF2), and the Throughput Run, which has "N" streams (N depends on scale) running simultaneously, each a mixed sequence of the same queries and the two refresh functions. During the throughput run you can expect to max out the CPU. But commercial databases generally have power runs that perform quite well even on multi-core systems (Oracle, without RAC, has published results with 144 cores on Solaris).

As for the I/O system saturating the CPU, it is twofold: the kernel fetching in the data, which saturates at some value, and in this case PostgreSQL reading the data and putting it into its buffer pool.

An example of how I use it is as follows: do a SELECT query on a table such that it results in a table scan without actually returning any rows. Now keep throwing hardware (better storage) at it until it saturates the CPU. That's the practical maximum you can do with that CPU/OS combination (assuming unlimited storage bandwidth). This is primarily used to guess how fast one of the TPC-H queries will complete. In my tests with PostgreSQL, I generally reach the CPU limit without even reaching the bandwidth of the underlying storage.

Just to give numbers: a single 2Gb Fibre Channel port can practically go up to 180 MB/sec, and single 4Gb ports have proven to go up to 360-370 MB/sec. So to saturate one 4Gb FC port, PostgreSQL has to be able to scan at 370 MB/sec without saturating the CPU. Then comes software striping, which allows multiple ports to be striped across, increasing the available bandwidth; now scanning has to be able to drive N x 370 MB/sec (all on a single core).

I had some numbers, and some limitations based on CPU frequency, block size, etc., but those were from the 8.1 days or so. I think to take PostgreSQL a bit toward the high end, we first have to scale out these numbers. Doing some sort of test in PostgreSQL farms for every release would actually help people see the amount of data that it can drive through.
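The bandwidth arithmetic above can be sketched in a few lines; this is a purely illustrative sketch (the function names and the 10 GiB / 60 second figures are hypothetical), using the ballpark per-port throughputs quoted in the thread:

```python
# Ballpark practical throughput per Fibre Channel port, as quoted above.
PORT_MB_PER_SEC = {"2Gb": 180, "4Gb": 370}

def required_scan_rate(ports, port_type="4Gb"):
    """MB/sec a single-core scan must sustain to saturate N striped ports."""
    return ports * PORT_MB_PER_SEC[port_type]

def observed_scan_rate(table_bytes, elapsed_sec):
    """MB/sec achieved by a timed full table scan that returned no rows."""
    return table_bytes / (1024 * 1024) / elapsed_sec

# One 4Gb port needs 370 MB/sec; four striped ports need N x 370 = 1480 MB/sec.
print(required_scan_rate(1))   # 370
print(required_scan_rate(4))   # 1480

# Hypothetical measurement: a 10 GiB table scanned in 60 seconds is
# ~171 MB/sec, not enough to saturate even a single 2Gb port.
print(round(observed_scan_rate(10 * 1024**3, 60)))  # 171
```

The point of the no-rows-returned scan is that the elapsed time then reflects scan speed alone, so the computed MB/sec can be compared directly against the per-port limits.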
We can actually work on some database operation metrics to also gauge how much each release improves over older releases. I have ideas for a few of them.

Regards,
Jignesh

Gregory Stark wrote:
> "Jignesh K. Shah" <J.K.Shah@Sun.COM> writes:
>
>> Then for the power run that is essentially running one query at a time should
>> essentially be able to utilize the full system (specially multi-core systems),
>> unfortunately PostgreSQL can use only one core. (Plus since this is read only
>> and there is no separate disk reader all other processes are idle) and system
>> is running at 1/Nth capacity (where N is the number of cores/threads)
>
> Is the whole benchmark like this or is this just one part of it?
>
> Is the i/o system really able to saturate the cpu though?