Re: BASH script for collecting analyze-related info - Mailing list pgsql-performance

From Ken Tanzer
Subject Re: BASH script for collecting analyze-related info
Date
Msg-id CAD3a31WPXL7is2AZRMMM1EhtoJ6WXQPtjBbhZZti1h1N=6B=-Q@mail.gmail.com
Whole thread Raw
In response to BASH script for collecting analyze-related info  (Ken Tanzer <ken.tanzer@gmail.com>)
List pgsql-performance
Hey John, and thanks for the input.

On Wed, Oct 16, 2013 at 11:00 AM, John Melesky <john.melesky@rentrakmail.com> wrote:
On Wed, Oct 16, 2013 at 12:20 AM, Ken Tanzer <ken.tanzer@gmail.com> wrote:
First and foremost (and primarily directed to people who are kind enough to provide help on this list), is a script like this worthwhile?  Will it help get better problem reports and save back-and-forth time of "please post x, y and z?"  If not, I don't see the point of pursuing this.

Given the amount of back-and-forth that's already happened in this thread, it looks like the script meets at least some needs.

Agreed.  It definitely met my immediate need.  But I'm actually seeing the lack of responses from "list helpers" as an indication this might not be of much general interest.  Of course that's ok too. :)


(Quoting myself.)  In an ideal situation, I'd see this being built into postgres, so that you could do something like EXPLAIN [ANALYZE] DESCRIBE ..., and get your descriptions directly as part of the output.  If that is pure fantasy, is there any way to directly identify all the tables, views and indexes that are involved in a query plan?

This approach seems like the only way you could really get from half-assed to Ass 1.0.  I'll just keep it in the pipe dream category unless or until someone else says otherwise.  A script it is, with a possible goal of getting to Ass 0.7.
 
 
If it is worthwhile, what's the best way of going about getting the necessary information?  Parsing the contents of EXPLAIN output that wasn't designed for this purpose seems doable, but clunky at best.  And it doesn't tell you which views are involved.

Well, EXPLAIN can actually output in a couple of more machine-readable formats (XML, JSON, and YAML), which would be a good place to start. I'm not sure what the best way to get view information would be (aside from potentially parsing the query itself), but I'm sure someone has an idea.
If that is pure fantasy, is there any way to directly identify all the tables, views and indexes that are involved in a query plan? 

Tables and indexes you can get from explaining with a machine-readable format, then traversing the tree looking for 'Relation Name' or 'Index Name' attributes.

I'm not seeing how these other output formats would help much.  They don't seem to contain additional information (and still not the views).  Plus, I'd think it would be best to make sure the describes are based on the same explain.  Since the "regular"? format seems to be what is submitted to the mailing list, it seems best to just stick with parsing that.
 
If no to the above, then parsing the analyze output seems the only option.  I was pondering what language a script to do this should be written in, and I'm thinking that writing it in pgpsql might be the best option, since it would be the most portable and presumably available on all systems.  I haven't fully thought that through, so I'm wondering if anyone sees reasons that wouldn't work, or if some other language would be a better or more natural choice.

I'm a proponent of just taking what you have and publishing it. That lets people use it who need it now, and makes it easier for others to improve it. 

I'm not really sure how putting this in on github and linking to it here makes it any easier to access than the version attached in this thread, but here goes.  https://github.com/ktanzer/pg_analyze_info

If there's a need for a cross-platform version, that's more likely to happen if people who use that platform can test that script for you.

I was thinking that a pgpsql version would be inherently more cross-platform, and definitely more available.  A bash script seems to cut out at least most of the Windows users.
 
Notably, publishing what you have now won't in any way prevent you from rewriting it in a plpgsql function and packaging that up on pgxn later.

Yup.

Cheers,
Ken

 
--
AGENCY Software  
A data system that puts you in control
100% Free Software
(253) 245-3801

learn more about AGENCY or
follow the discussion.

pgsql-performance by date:

Previous
From: Gavin Wahl
Date:
Subject: Re: Planner Conceptual Error when Joining a Subquery -- Outer Query Condition not Pulled Into Subquery
Next
From: sparikh
Date:
Subject: Re: Hot Standby performance issue