Re: A costing analysis tool - Mailing list pgsql-hackers
From | Kevin Grittner |
---|---|
Subject | Re: A costing analysis tool |
Date | |
Msg-id | s34fab58.030@gwmta.wicourts.gov |
In response to | A costing analysis tool ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>) |
Responses | Re: A costing analysis tool |
List | pgsql-hackers |
I think I get your point now. If I understand it, you could accomplish what you want under my rough ("exists only in my head so far") design by creating your own test cases and putting together a script to run just those. I would be extremely leery of comparing tests against a database under load with tests run against an otherwise idle system -- you would expect the time per cost unit to increase -- but as long as the load remained relatively constant, you could get useful information from comparing the various tests from that system under that load. Are we tracking now?

I propose capturing only three values from the output of explain analyze, and saving them with many columns of context information. Below is a sample script to capture the three columns and display them with the derived ratio and two very limited bits of context. Don't worry about the lack of proper context just yet; we're talking about what I propose to capture from the output files. Here's the script:

------------------------------------------------------------
# simplistic example of parsing through PostgreSQL analyze verbose output
import sys

def printHuman(filename, lineNum, operation, cost, actualTime):
    print "%-60s %3s %-20s %10.2f %12.3f %f" \
        % (filename, lineNum, operation, float(cost), float(actualTime),
           float(actualTime) / float(cost))

for inputfilename in sys.argv[1:]:
    inputfile = open(inputfilename, 'r')
    inputlines = inputfile.readlines()
    inputfile.close()
    for indx in range(len(inputlines)):
        line = inputlines[indx].lstrip()
        if line[:2] == '->':
            # operation name: everything before the parenthesized costs,
            # minus any trailing " on <table>" or " using <index>"
            opName = line[3:line.find('(')].strip()
            tailstripwords = (' on ', ' using ')
            for word in tailstripwords:
                onpos = opName.find(word)
                if onpos > -1:
                    opName = opName[:onpos]
            # total estimated cost: the number after '..' in "cost=n..n"
            dotspos = line.find('..')
            cost = line[dotspos + 2: line.find(' ', dotspos)]
            # total actual time: the number after '..' in "actual time=n..n"
            nextdotspos = line.find('..', line.find('actual time'))
            actTime = line[nextdotspos + 2: line.find(' ', nextdotspos)]
            printHuman(inputfilename, indx, opName, cost, actTime)
------------------------------------------------------------

and here's the output based on the plans shown in an earlier thread
( http://archives.postgresql.org/pgsql-hackers/2005-10/msg00303.php ):

------------------------------------------------------------
[kgrittn@WCS45792 kgrittn]$ python parse_plan.py *.eplain.analyze
milw-cal-daterange-bare-cached.eplain.analyze               1 Merge Join          176218.09     4500.785 0.025541
milw-cal-daterange-bare-cached.eplain.analyze               3 Sort                 48975.72      812.160 0.016583
milw-cal-daterange-bare-cached.eplain.analyze               5 Bitmap Heap Scan     47081.47      427.463 0.009079
milw-cal-daterange-bare-cached.eplain.analyze               7 Bitmap Index Scan      297.07       19.089 0.064258
milw-cal-daterange-bare-cached.eplain.analyze               9 Sort                126945.94     3124.966 0.024617
milw-cal-daterange-bare-cached.eplain.analyze              11 Bitmap Heap Scan    119071.84     1195.302 0.010038
milw-cal-daterange-bare-cached.eplain.analyze              13 Bitmap Index Scan     1037.84       54.917 0.052915
milw-cal-daterange-bare-noncached.eplain.analyze            1 Merge Join          176218.09    32071.776 0.182000
milw-cal-daterange-bare-noncached.eplain.analyze            3 Sort                 48975.72     5954.706 0.121585
milw-cal-daterange-bare-noncached.eplain.analyze            5 Bitmap Heap Scan     47081.47     5560.895 0.118112
milw-cal-daterange-bare-noncached.eplain.analyze            7 Bitmap Index Scan      297.07       52.675 0.177315
milw-cal-daterange-bare-noncached.eplain.analyze            9 Sort                126945.94    25551.291 0.201277
milw-cal-daterange-bare-noncached.eplain.analyze           11 Bitmap Heap Scan    119071.84    23588.122 0.198100
milw-cal-daterange-bare-noncached.eplain.analyze           13 Bitmap Index Scan     1037.84      146.412 0.141074
milw-cal-daterange-nomergejoin-cached.eplain.analyze        1 Nested Loop         298740.94     1464.720 0.004903
milw-cal-daterange-nomergejoin-cached.eplain.analyze        2 Index Scan           49419.45      412.268 0.008342
milw-cal-daterange-nomergejoin-cached.eplain.analyze        4 Index Scan               9.93        0.019 0.001913
milw-cal-daterange-nomergejoin-noncached.eplain.analyze     1 Nested Loop         298740.94    14277.124 0.047791
milw-cal-daterange-nomergejoin-noncached.eplain.analyze     2 Index Scan           49419.45    11510.723 0.232919
milw-cal-daterange-nomergejoin-noncached.eplain.analyze     4 Index Scan               9.93        0.062 0.006244
------------------------------------------------------------

Besides the additional context info, I expect to be storing the log of the ratio, since it seems to make more sense to average and look for outliers based on that than on the raw ratio.
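To show why I lean toward the log, here's a rough illustrative sketch -- not part of the proposed tool, and the two-standard-deviation cutoff is just an arbitrary choice for the example -- that averages the log of the actual-time/cost ratio and flags rows that stray far from the mean, using the cached-plan rows from the output above:

------------------------------------------------------------
# rough sketch only: average log(actual_time / cost) and flag rows more
# than two standard deviations from the mean (the cutoff is arbitrary)
import math

# (cost, actual_time) pairs copied from the cached-plan rows above
samples = [
    (176218.09, 4500.785),
    (48975.72, 812.160),
    (47081.47, 427.463),
    (297.07, 19.089),
    (126945.94, 3124.966),
    (119071.84, 1195.302),
    (1037.84, 54.917),
]

logratios = [math.log(actual / cost) for (cost, actual) in samples]
mean = sum(logratios) / len(logratios)
sd = math.sqrt(sum((x - mean) ** 2 for x in logratios) / len(logratios))

for (cost, actual), lr in zip(samples, logratios):
    flag = ''
    if sd > 0 and abs(lr - mean) > 2 * sd:
        flag = '  <-- possible outlier'
    print("%10.2f %12.3f  log(ratio) = %8.4f%s" % (cost, actual, lr, flag))

print("mean log(ratio) = %.4f, i.e. about %f ms of actual time per cost unit"
      % (mean, math.exp(mean)))
------------------------------------------------------------

Exponentiating the mean of the logs gives a geometric-mean figure for milliseconds of actual time per cost unit, which a single wild row skews far less than it would skew an average of the raw ratios.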
Is there anything else you think should be captured? (Keep in mind that I'm trying to keep the scope of this very focused, so it can actually get done, but I'm not averse to capturing something else that's just sitting there if it's useful to someone.)

-Kevin


>>> Josh Berkus <josh@agliodbs.com> 10/14/05 11:20 AM >>>
Kevin,

> It sounds as though you are more focused on picking up costing
> problems which happen during production -- which is clearly
> valuable, but addresses a somewhat different set of needs than
> I was looking at. That said, it seems like there is potential to share
> significant code between the two techniques. We'll have to see if
> we can work that out.

Hmmmm. I think we're not communicating yet. I wanted to get across two points:

1) The "cost accuracy statistics collector" should be a *separate* tool from the performance test. This will allow the collector to be used with a variety of different test frameworks, increasing its use and our general knowledge of costing under different systems. It will also allow us to use pre-built tests, which will save time.

2) The collector should be designed in such a way as to allow collection of data from production databases, for two reasons:

a) There are factors which affect cost, like concurrency, system load, and swapping, that tend not to occur on test systems. Ignoring these factors will make our cost model fragile and no improvement over the current code.

b) Far more (like 10x) people in the community would be willing to run a relatively non-intrusive tool against their production system than would be willing to set up a full-blown performance test harness. The more information we collect about the greater variety of systems and application designs, the more accurate our cost model will be.

Therefore we want a tool which can be used by as many people as possible, which means that production systems need to be an option.

> I didn't want to broach the subject of the programming language
> for this at the early conceptual stages, but if we're talking about
> code sharing, it can't wait too long, so I'll jump in with it now. I was
> considering using python to program the tool I was discussing.

I see nothing wrong with using Python.

> If
> python is used, I don't care whether there is any change to EXPLAIN
> ANALYZE -- it only takes a few lines of code to pull out what I need
> in the current form.

Hmmm ... I think you're collecting less data than I would consider necessary for full analysis of complex queries. Could you give an example?

> My concern is whether python is supported on
> all of the target platforms.

I think python support is broad enough that using it is not an inhibitor. We're not talking Haskell, after all.
> I think I will be significantly more
> productive at this in python than if I used C or perl, but if it's not
> accessible to the PostgreSQL community as a whole, I'll cope.
> Comments, anyone?

There are actually lots of Python people in the community, and Python is easy to learn/read if you are experienced with C, Perl, or Ruby. So I have no objections.

> I need to have somewhere for the work to live, and I quite frankly
> would just as soon dodge the overhead of setting up and maintaining
> something, so if no one has objections or other suggestions, I'm
> inclined to take you up on your offer to use your testperf project.
> Does anyone think some other location would be more appropriate?

More appropriate than pgFoundry? I can't imagine. You'll need to register as a user on pgFoundry, and send me your user name.

> If we get into much more detail, I assume we should take this
> off-list.

Well, once you get going we'll use the testperf-general list, which doesn't get much traffic these days. pgFoundry also supports bug tracking, task management, and document sharing; you should check it out.

--Josh

--
Josh Berkus
Aglio Database Solutions
San Francisco