Re: A costing analysis tool - Mailing list pgsql-hackers
| From | Josh Berkus |
|---|---|
| Subject | Re: A costing analysis tool |
| Msg-id | 200510140920.09398.josh@agliodbs.com |
| In response to | A costing analysis tool ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>) |
| List | pgsql-hackers |
Kevin,

> It sounds as though you are more focused on picking up costing
> problems which happen during production -- which is clearly
> valuable, but addresses a somewhat different set of needs than
> I was looking at. That said, it seems like there is potential to share
> significant code between the two techniques. We'll have to see if
> we can work that out.

Hmmmm. I think we're not communicating yet. I wanted to get across two points:

1) The "cost accuracy statistics collector" should be a *separate* tool from the performance test. This will allow the collector to be used with a variety of different test frameworks, increasing its use and our general knowledge of costing under different systems. It will also allow us to use pre-built tests, which will save time.

2) The collector should be designed in such a way as to allow collection of data from production databases, for two reasons:

a) There are factors which affect cost, like concurrency, system load, and swapping, that tend not to occur on test systems. Ignoring these factors will make our cost model fragile and no improvement over the current code.

b) Far more (like 10x) people in the community would be willing to run a relatively non-intrusive tool against their production system than would be willing to set up a full-blown performance test harness.

The more information we collect about a greater variety of systems and application designs, the more accurate our cost model will be. Therefore we want a tool which can be used by as many people as possible, which means that production systems need to be an option.

> I didn't want to broach the subject of the programming language
> for this at the early conceptual stages, but if we're talking about
> code sharing, it can't wait too long, so I'll jump in with it now. I was
> considering using python to program the tool I was discussing.

I see nothing wrong with using Python.

> If python is used, I don't care whether there is any change to EXPLAIN
> ANALYZE -- it only takes a few lines of code to pull out what I need
> in the current form.

Hmmm ... I think you're collecting less data than I would consider necessary for full analysis of complex queries. Could you give an example? (A rough sketch of that sort of parsing is appended below, for concreteness.)

> My concern is whether python is supported on
> all of the target platforms.

I think python support is broad enough that using it is not an inhibitor. We're not talking Haskell, after all.

> I think I will be significantly more
> productive at this in python than if I used C or perl, but if it's not
> accessible to the PostgreSQL community as a whole, I'll cope.
> Comments, anyone?

There are actually lots of Python people in the community, and Python is easy to learn/read if you are experienced with C, Perl, or Ruby. So I have no objections.

> I need to have somewhere for the work to live, and I quite frankly
> would just as soon dodge the overhead of setting up and maintaining
> something, so if no one has objections or other suggestions, I'm
> inclined to take you up on your offer to use your testperf project.
> Does anyone think some other location would be more appropriate?

More appropriate than pgFoundry? I can't imagine. You'll need to register as a user on pgFoundry and send me your user name.

> If we get into much more detail, I assume we should take this
> off-list.

Well, once you get going we'll use the testperf-general list, which doesn't get much traffic these days. pgFoundry also supports bug tracking, task management, and document sharing; you should check it out.
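For concreteness, the "few lines of code" Kevin mentions would look roughly like the sketch below. This is just an illustration against the current EXPLAIN ANALYZE text format, not part of any agreed design; the script name, the regex, and the choice to scale the per-loop figures by loops are mine, not Kevin's.

    import re
    import sys

    # Matches the per-node annotations in EXPLAIN ANALYZE output, e.g.:
    #   Seq Scan on onek  (cost=0.00..470.00 rows=10000 width=244)
    #                     (actual time=0.021..2.651 rows=10000 loops=1)
    NODE_RE = re.compile(
        r'^\s*(?:->\s+)?(?P<node>.+?)\s+'
        r'\(cost=[0-9.]+\.\.(?P<est_cost>[0-9.]+)\s+'
        r'rows=(?P<est_rows>\d+)\s+width=\d+\)\s+'
        r'\(actual time=[0-9.]+\.\.(?P<act_ms>[0-9.]+)\s+'
        r'rows=(?P<act_rows>\d+)\s+loops=(?P<loops>\d+)\)'
    )

    def parse_plan(text):
        """Yield (node, estimated cost, estimated rows, actual ms, actual rows)."""
        for line in text.splitlines():
            m = NODE_RE.match(line)
            if not m:
                continue
            loops = int(m.group('loops'))
            # actual time and rows are reported per loop; scale by loops for totals
            yield (m.group('node'),
                   float(m.group('est_cost')),
                   int(m.group('est_rows')),
                   float(m.group('act_ms')) * loops,
                   int(m.group('act_rows')) * loops)

    if __name__ == '__main__':
        # e.g.: psql -c 'EXPLAIN ANALYZE SELECT ...' dbname | python parse_plan.py
        for rec in parse_plan(sys.stdin.read()):
            print(rec)

Whether a handful of fields like these is enough for full analysis of complex queries is, of course, the open question above.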
--Josh

--
Josh Berkus
Aglio Database Solutions
San Francisco