Thread: Project suggestion: benchmark utility for PostgreSQL

Project suggestion: benchmark utility for PostgreSQL

From
"Mickael DELOISON"
Date:
Hello everybody.

I am a student and I would like to have your opinion on a project I
plan to submit for GSoC. At school, when I work with relational
databases I have problems to test tables' structure and queries
because I need to insert test data manually, which is very unpleasant.
Therefore, I suggest creating a benchmark utility for PostgreSQL.
This utility would be divided in two parts: first part is the **random
data generator** to fill in the tables with test data, and second part
is a **test module** in which we can execute a set of queries and
analyze response time and results.

The random data generator would be able to generate data of every type
(from 8.1 Numeric Types to 8.10 Arrays in official documentation).
When starting, the program connects to a database, retrieves the
tables and proposes adding random data to the user. The user uses
pattern and dictionaries to make those generators work.

The test module would consist in a sort of unit tests module. The user
adds queries and possibly sets the expected results of those queries.
The module executes the queries and gives for each one the results,
the time it took to execute, and if the results are those which were
expected.

And finally, I did not talk about that before. But I think it would be
a great idea for a complete benchmark utility... if it would give some
advices to optimize the tables' structure (columns' type, indexes,
foreign keys). Of course, those advices would be for a novice user as
expert already knows how to optimize the columns types, the joins...

For a programming language, as it would be for GSoC, it has to be
realized in three month and I believe the utility has to be
cross-platform (anyway I want it to be). So I think Java would be
good. I am very used to Java and Swing programming. What do you think
about that choice? If you feel Java is a bad choice, there is
C++/Boost/wxWidget/ (like pgAdmin). But with wxWidget, I am not sure
if a GUI works under Windows and Linux it will work under MacOS
without hacks.

Thank you for having read me until there. I am very excited about such
project. I wait for your comments.

Best regards,
Mickael Deloison


Re: Project suggestion: benchmark utility for PostgreSQL

From
"Florian G. Pflug"
Date:
Mickael DELOISON wrote:
> For a programming language, as it would be for GSoC, it has to be
> realized in three month and I believe the utility has to be
> cross-platform (anyway I want it to be). So I think Java would be
> good. I am very used to Java and Swing programming. What do you think
> about that choice? If you feel Java is a bad choice, there is
> C++/Boost/wxWidget/ (like pgAdmin). But with wxWidget, I am not sure
> if a GUI works under Windows and Linux it will work under MacOS
> without hacks.

I just wanted to comment on the wxWidgest-under-OSX part. I works
pretty well, but of course there are some bugs in wxWidgets that
plague only a particular plattform. And since OSX seems to be one
of the lesser used one, there appear to be more bugs of this kind.

But if you grep through the pgAdmin3 sources, you'll find only
little OSX-specific code.

One nice thing about wxWidgets is that applications that use if
feel like native applications on the platforms that wxWidgets
supports. Java-GUIs often feel rather alien, at least IMHO.

greetings, Florian Pflug



Re: Project suggestion: benchmark utility for PostgreSQL

From
Josh Berkus
Date:
Mickael,

> I am a student and I would like to have your opinion on a project I
> plan to submit for GSoC. At school, when I work with relational
> databases I have problems to test tables' structure and queries
> because I need to insert test data manually, which is very unpleasant.
> Therefore, I suggest creating a benchmark utility for PostgreSQL.

I think your project sounds really cool, but also not doable in 3 months 
from scratch.  You need to build on the work of others.  Simply 
designing a viable benchmark schema and data set would be a 
more-than-3-month process; developing them *and* the tools to use them 
would likely take you more than a year.  I know whereof I speak.

Therefore, I think you should attach your proposal to one of the 
following projects:

PGBuildfarm: orient your tools more towards being "performance unit 
tests" than part of a benchmark.  Your tool could then become an 
additional component of the Buildfarm, and your mentor would be Andrew 
Dunstan.  Note that this would make any GUI components the last thing 
you do.

TPC-E/DBT5: you could work with Rilson on modularizing DBT5 so that 
users could run a smaller version and do "unit tests" of parts of the 
TPCE schema/queryset.  In that case, your mentor would be Mark Wong.

OpenJPA: the JPA project is working on creating OSDB+Java performance 
unit tests as well (database-agnostic).  I know they could use help; if 
you did this, I'd recuit a JPA person to be your mentor.

EAStress: You could use Spec's recently liberalized rules to build your 
tools on top of the EAstress workload.  This would have a couple 
disadvantages, though: EAstress doesn't use database features much, and 
the workload isn't open source, just free.  In that case, your mentor 
would be me.

> For a programming language, as it would be for GSoC, it has to be
> realized in three month and I believe the utility has to be
> cross-platform (anyway I want it to be). So I think Java would be
> good. I am very used to Java and Swing programming. What do you think
> about that choice? If you feel Java is a bad choice, there is
> C++/Boost/wxWidget/ (like pgAdmin). But with wxWidget, I am not sure
> if a GUI works under Windows and Linux it will work under MacOS
> without hacks.

I don't see any issue with using Java.   As SoC administrator, my main 
concern is that you finish a usable tool, so I'd go with whatever you 
can code the best in.

Anyway, it sounds like a really cool project, and I look forward to your 
application.

--Josh Berkus