Re: A costing analysis tool - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: A costing analysis tool
Msg-id 200510191030.09135.josh@agliodbs.com
In response to Re: A costing analysis tool  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
Kevin,

> If we stored the actual queries and the EXPLAIN ANALYZE results (when
> generated) in the database, what would be the purpose of the node_name,
> db_object, and condition_detail columns?  They don't seem like they
> would be useful for statistical analysis, and it seems like the
> information would be more useful in context.  Are these columns really
> needed?

Yes.  For example, the only way you're going to analyze index costing by type 
of index is if the index name is stored somewhere (db_object) so that it 
can be matched to its characteristics.  Similarly, condition_detail would 
let us determine that (for example) we have costing problems when filters 
involve more than 2 columns or complex expressions.  
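
To make the db_object case concrete, here is a minimal sketch of that kind of analysis. It assumes per-node rows that carry db_object alongside a planner estimate and a measured time; est_cost, actual_ms, the index names, and the index_type lookup are all hypothetical and stand in for whatever the tool ends up storing.

```python
from collections import defaultdict

# Hypothetical per-node rows, as a plan-parsing script might store them.
# Only node_type and db_object come from the proposed schema; est_cost and
# actual_ms are assumed column names for illustration.
plan_nodes = [
    {"node_type": "Index Scan", "db_object": "orders_pkey", "est_cost": 8.0, "actual_ms": 0.4},
    {"node_type": "Index Scan", "db_object": "orders_pkey", "est_cost": 10.0, "actual_ms": 0.6},
    {"node_type": "Index Scan", "db_object": "docs_body_idx", "est_cost": 20.0, "actual_ms": 9.0},
]

# Hypothetical lookup of index characteristics, keyed by index name.
# In practice this would be filled from the system catalogs.
index_type = {"orders_pkey": "btree", "docs_body_idx": "gist"}


def ms_per_cost_unit_by_index_type(nodes, index_type):
    """Average actual_ms / est_cost for each index type."""
    ratios = defaultdict(list)
    for node in nodes:
        kind = index_type.get(node["db_object"])
        if kind is None or node["est_cost"] <= 0:
            continue  # not a known index, or nothing to divide by
        ratios[kind].append(node["actual_ms"] / node["est_cost"])
    return {kind: sum(vals) / len(vals) for kind, vals in ratios.items()}


by_type = ms_per_cost_unit_by_index_type(plan_nodes, index_type)
```

Without db_object in the row there is nothing to key the lookup on, so the per-index-type grouping cannot be done at all.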

Node_name is actually duplicative of some of the other columns, so I 
suppose it could be dropped.

> For a given node_type, are there multiple valid condition_type values?
> If so, I need to modify my Python script to capture this.  If not, I
> don't see a need to store it.

I'm not sure.  Even if there aren't now, there could be in the future. I'm 
more focused on supporting cross-node-type conditions.  For example, 
"Filter" conditions can apply to a variety of node types (Index Scan, 
Merge Join, Subquery Scan, Seq Scan, aggregates).   If we were costing 
Filters, we'd want to be able to aggregate their stats regardless of the 
node in which they occurred.
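
As a rough sketch of that cross-node aggregation, assuming each stored row carries both node_type and condition_type: the est_rows and actual_rows columns and the sample values below are hypothetical, chosen only to show the grouping.

```python
from collections import defaultdict

# Hypothetical stored rows; est_rows and actual_rows are assumed column names.
rows = [
    {"node_type": "Seq Scan", "condition_type": "Filter", "est_rows": 100, "actual_rows": 400},
    {"node_type": "Index Scan", "condition_type": "Filter", "est_rows": 50, "actual_rows": 50},
    {"node_type": "Index Scan", "condition_type": "Index Cond", "est_rows": 10, "actual_rows": 20},
    {"node_type": "Merge Join", "condition_type": "Merge Cond", "est_rows": 30, "actual_rows": 30},
]


def summarize_by_condition(rows):
    """Total row estimates per condition_type, ignoring which node held it."""
    summary = defaultdict(
        lambda: {"nodes": 0, "est_rows": 0, "actual_rows": 0, "node_types": set()}
    )
    for row in rows:
        entry = summary[row["condition_type"]]
        entry["nodes"] += 1
        entry["est_rows"] += row["est_rows"]
        entry["actual_rows"] += row["actual_rows"]
        entry["node_types"].add(row["node_type"])
    return dict(summary)


summary = summarize_by_condition(rows)
filter_stats = summary["Filter"]
```

Here the Filter entry combines the Seq Scan and Index Scan rows, which only works if condition_type is stored as its own column rather than inferred from node_type.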

I'm also really unclear on why you're so focused on storing less 
information rather than more.  In an "investigation" tool like this, it's 
important to collect as much data as possible because we don't know what's 
going to be valuable until we analyze it.   You seem to be starting out 
with the idea that you *already* know exactly where the problems are 
located, in which case why develop a tool at all?  Just fix the problem.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

