Re: Performance - Mailing list pgsql-performance

From Sethu Prasad
Subject Re: Performance
Msg-id BANLkTi=s9kgnXjQ1HeeMNWQu9dBCoin3Fw@mail.gmail.com
In response to Re: Performance  (Greg Smith <greg@2ndquadrant.com>)
List pgsql-performance
Just wanted to share these DBT-2 and DBT-5 links:

http://archives.postgresql.org/pgsql-performance/2011-04/msg00145.php
http://sourceforge.net/mailarchive/forum.php?forum_name=osdldbt-general&max_rows=25&style=nested&viewmonth=201104



On Wed, Apr 27, 2011 at 11:55 PM, Greg Smith <greg@2ndquadrant.com> wrote:
Tomas Vondra wrote:
Hmmm, just wondering - what would be needed to build such 'workload
library'? Building it from scratch is not feasible IMHO, but I guess
people could provide their own scripts (as simple as 'set up a bunch
of tables, fill it with data, run some queries') and there's a pile of
such examples in the pgsql-performance list.
 

The easiest place to start is by re-using the work already done by the TPC for benchmarking commercial databases.  There are ports of the TPC workloads to PostgreSQL available in the DBT-2, DBT-3, and DBT-5 tests; see http://wiki.postgresql.org/wiki/Category:Benchmarking for initial information on those (the page on TPC-H is quite relevant too).  I'd like to see all three of those DBT tests running regularly, as well as two tests that can be simulated with pgbench or sysbench:  an in-cache read-only test, and a write-as-fast-as-possible test.
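
As a rough illustration of the two pgbench-style tests mentioned above, here is a minimal Python sketch that only builds the command lines. The helper names, the database name "bench", and the client, thread, and duration values are assumptions for illustration, not settings from this thread. It uses the standard pgbench options -i, -s, -S, -n, -c, -j, and -T, and treats pgbench's default TPC-B-like script as a stand-in for the write-heavy test.

```python
import shlex


def pgbench_init(dbname, scale):
    """Command to create and populate the pgbench tables at a given scale."""
    return ["pgbench", "-i", "-s", str(scale), dbname]


def pgbench_read_only(dbname, clients=8, threads=2, seconds=60):
    """Select-only run (-S); with a small scale this stays in cache."""
    # -n skips the vacuum that pgbench otherwise runs before the test.
    return ["pgbench", "-S", "-n",
            "-c", str(clients), "-j", str(threads),
            "-T", str(seconds), dbname]


def pgbench_write_heavy(dbname, clients=8, threads=2, seconds=60):
    """Default TPC-B-like script: each transaction updates and inserts."""
    return ["pgbench",
            "-c", str(clients), "-j", str(threads),
            "-T", str(seconds), dbname]


if __name__ == "__main__":
    # Print the commands instead of running them; "bench" is a made-up name.
    for cmd in (pgbench_init("bench", 10),
                pgbench_read_only("bench"),
                pgbench_write_heavy("bench")):
        print(shlex.join(cmd))
```

Printing rather than executing keeps the sketch safe to try; the lists could be passed to subprocess.run once the values suit the hardware being tested.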

The main problem with re-using posts from this list for workload testing is getting an appropriately sized data set for them that stays relevant.  The nature of this sort of benchmark always includes some notion of the size of the database, and you get different results based on how large things are relative to RAM and the database parameters.  That said, some sort of systematic collection of "hard queries" would also be a very useful project for someone to take on.
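
To make the point about size relative to RAM concrete, here is a small Python sketch that classifies a data set against the cache sizes and picks a pgbench scale for a target size. The function names are hypothetical, and the figure of about 16 MB per pgbench scale unit is only a rough rule of thumb assumed here; the real size should be measured on the system under test.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3

# Assumption: roughly 16 MiB of data per pgbench scale unit (rule of thumb).
ASSUMED_BYTES_PER_SCALE = 16 * MIB


def classify_dataset(db_bytes, shared_buffers_bytes, ram_bytes):
    """Say which cache tier a data set of db_bytes would fit in."""
    if db_bytes <= shared_buffers_bytes:
        return "fits in shared_buffers"
    if db_bytes <= ram_bytes:
        return "fits in RAM"
    return "larger than RAM"


def scale_for_target(target_bytes, bytes_per_scale=ASSUMED_BYTES_PER_SCALE):
    """Smallest pgbench scale whose estimated size reaches target_bytes."""
    return max(1, math.ceil(target_bytes / bytes_per_scale))


if __name__ == "__main__":
    # Hypothetical machine: 2 GiB shared_buffers, 8 GiB RAM.
    for size in (1 * GIB, 4 * GIB, 16 * GIB):
        print(size // GIB, "GiB:", classify_dataset(size, 2 * GIB, 8 * GIB),
              "-> scale", scale_for_target(size))
```

Running the same workload at one scale from each tier is one way to see how much the results depend on data set size.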

People show up regularly who want to play with the optimizer in some way.  It's still possible to do that by targeting specific queries you want to accelerate, where it's obvious (or, more likely, hard but still straightforward) how to do better.  But I don't think any of these proposed exercises adjusting the caching model or default optimizer parameters in the database is going anywhere without some sort of benchmarking framework for evaluating the results.  And the TPC tests are a reasonable place to start.  They're a good mixed set of queries, and improving results on those does turn into a real commercial benefit to PostgreSQL in the future too.


--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


