Re: A costing analysis tool - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: A costing analysis tool
Msg-id 200510140920.09398.josh@agliodbs.com
In response to A costing analysis tool  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
Kevin,

> It sounds as though you are more focused on picking up costing
> problems which happen during production -- which is clearly
> valuable, but addresses a somewhat different set of needs than
> I was looking at.  That said, it seems like there is potential to share
> significant code between the two techniques.  We'll have to see if
> we can work that out.

Hmmmm.  I think we're not communicating yet.  I wanted to get across two 
points:

1) The "cost accuracy statistics collector" should be a *separate* tool from 
the performance test.  This will allow the collector to be used with a 
variety of different test frameworks, increasing its use and our general 
knowledge of costing under different systems.   It will also allow us to use 
pre-built tests which will save time.

2) The collector should be designed in such a way as to allow collection of 
data from production databases, for two reasons:

  a)  There are factors which affect cost, like concurrency, system load and 
  swapping, that tend not to occur on test systems.  Ignoring these factors 
  will make our cost model fragile and no improvement over the current code.

  b)  Far more (like 10x) people in the community would be willing to run a 
  relatively non-intrusive tool against their production system than would be 
  willing to set up a full-blown performance test harness.  The more 
  information about the greater variety of systems and application designs we 
  collect, the more accurate our cost model is.  Therefore we want a tool which 
  can be used by as many people as possible, which means that production 
  systems need to be an option.

> I didn't want to broach the subject of the programming language
> for this at the early conceptual stages, but if we're talking about
> code sharing, it can't wait too long, so I'll jump in with it now.  I was
> considering using python to program the tool I was discussing. 

I see nothing wrong with using Python.

> If 
> python is used, I don't care whether there is any change to EXPLAIN
> ANALYZE -- it only takes a few lines of code to pull out what I need
> in the current form.  

Hmmm ... I think you're collecting less data than I would consider necessary 
for full analysis of complex queries.  Could you give an example?
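
To make the "few lines of code" concrete, here is a minimal sketch of pulling 
per-node estimates and timings out of text-format EXPLAIN ANALYZE output.  It 
is an illustration only, not the code Kevin has in mind: the names NODE_RE, 
parse_plan and SAMPLE are hypothetical, the sample plan is made up, and the 
pattern assumes the plain "(cost=... rows=... width=...) (actual time=... 
rows=... loops=...)" line layout with integer row counts, which may need 
adjusting for other server versions.

```python
import re

# Assumed layout of a plan node line in text-format EXPLAIN ANALYZE output:
#   [indent][-> ]Node  (cost=S..T rows=R width=W) (actual time=s..t rows=r loops=n)
NODE_RE = re.compile(
    r"^(?P<indent>\s*)(?:->\s*)?(?P<node>.+?)\s+"
    r"\(cost=(?P<startup_cost>\d+\.\d+)\.\.(?P<total_cost>\d+\.\d+)\s+"
    r"rows=(?P<est_rows>\d+)\s+width=\d+\)\s+"
    r"\(actual time=(?P<startup_ms>\d+\.\d+)\.\.(?P<total_ms>\d+\.\d+)\s+"
    r"rows=(?P<rows>\d+)\s+loops=(?P<loops>\d+)\)"
)


def parse_plan(text):
    """Return one dict per plan node that carries both estimates and timings."""
    nodes = []
    for line in text.splitlines():
        m = NODE_RE.match(line)
        if m is None:
            # Skips detail lines (Filter, Sort Key), the runtime footer,
            # and nodes that were never executed.
            continue
        loops = int(m.group("loops"))
        nodes.append({
            "indent": len(m.group("indent")),  # leading spaces, a rough depth
            "node": m.group("node"),
            "total_cost": float(m.group("total_cost")),
            "est_rows": int(m.group("est_rows")),
            # "actual time" is a per-loop average, so scale by loops.
            "total_ms": float(m.group("total_ms")) * loops,
            "rows": int(m.group("rows")),
            "loops": loops,
        })
    return nodes


# Made-up plan text, used only to exercise the parser.
SAMPLE = """\
Sort  (cost=100.50..101.00 rows=200 width=8) (actual time=2.500..2.600 rows=180 loops=1)
  Sort Key: a
  ->  Seq Scan on t  (cost=0.00..90.00 rows=200 width=8) (actual time=0.010..1.200 rows=180 loops=1)
        Filter: (a > 5)
Total runtime: 2.800 ms
"""

for n in parse_plan(SAMPLE):
    print(n["node"], n["total_cost"], n["total_ms"])
```

A sketch like this captures only the cost and timing pair for each node.  
Anything about the node's parameters (filter conditions, join clauses, index 
names) lives on the detail lines it skips, which is the kind of data a fuller 
analysis of complex queries would also want.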

> My concern is whether python is supported on
> all of the target platforms.  

I think python support is broad enough that using it is not an inhibitor.  
We're not talking Haskell, after all.

>  I think I will be significantly more
> productive at this in python than if I used C or perl, but if it's not
> accessible to the PostgreSQL community as a whole, I'll cope.
> Comments, anyone?

There are actually lots of Python people in the community, and Python is easy 
to learn/read if you are experienced with C, Perl, or Ruby.  So I have no 
objections.

> I need to have somewhere for the work to live, and I quite frankly
> would just as soon dodge the overhead of setting up and maintaining
> something, so if no one has objections or other suggestions, I'm
> inclined to take you up on your offer to use your testperf project.
> Does anyone think some other location would be more appropriate?

More appropriate than pgFoundry?  I can't imagine.  You'll need to register as 
a user on pgFoundry, and send me your user name.

> If we get into much more detail, I assume we should take this
> off-list.

Well, once you get going we'll use the testperf-general list, which doesn't 
get much traffic these days.  pgFoundry also supports bug tracking, task 
management, and document sharing; you should check it out.

--Josh

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

