Thread: Auto explain target tables

Auto explain target tables

From
Миша Тюрин
Date:
hi, list, again. the next proposal into auto explain. one would be happy if could set list of target tables and
indexes.sometimes it is very hard to detect who is using your indexes. but turn total logging on under thousands
transactionsper seconds is not seems like nice idea couse size of resulting log files (cpu utilization might not be so
critical)

Re: Auto explain target tables

From
Craig Ringer
Date:
On 07/21/2013 10:42 PM, Миша Тюрин wrote:
> 
> hi, list, again. the next proposal into auto explain. one would be happy if could set list of target tables and
indexes.sometimes it is very hard to detect who is using your indexes. but turn total logging on under thousands
transactionsper seconds is not seems like nice idea couse size of resulting log files (cpu utilization might not be so
critical)

That sounds like a good idea - and since auto_explain is a contrib
module, it could potentially be a way to get into PostgreSQL
development. The code is in contrib/auto_explain/auto_explain.c .

The challenge here is that planned queries don't just touch one table or
index. You'd need to walk the query plan (a graph of Node structures) to
determine which table(s) and index(es) are touched by the query there's
something in the sources for it already (I haven't checked).

You'll also need a way to supply the list of tables/indexes you are
interested in to the extension. The simplest way to start with is likely
to be reading a separate file from the data dir that contains one
relation name per line. Integrating it into the postgresql.conf GUC
machinery is harder; you'll need a way to parse a list of tables from a
GUC (maybe you can re-use the search_path code for this?) or some other
way to handle your need for a multi-valued setting.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: Auto explain target tables

From
Craig Ringer
Date:
On 07/23/2013 11:24 AM, Craig Ringer wrote:
> On 07/21/2013 10:42 PM, Миша Тюрин wrote:
>>
>> hi, list, again. the next proposal into auto explain. one would be happy if could set list of target tables and
indexes.sometimes it is very hard to detect who is using your indexes. but turn total logging on under thousands
transactionsper seconds is not seems like nice idea couse size of resulting log files (cpu utilization might not be so
critical)
> 
> That sounds like a good idea - and since auto_explain is a contrib
> module, it could potentially be a way to get into PostgreSQL
> development. The code is in contrib/auto_explain/auto_explain.c .
> 
> The challenge here is that planned queries don't just touch one table or
> index. You'd need to walk the query plan (a graph of Node structures) to
> determine which table(s) and index(es) are touched by the query there's
> something in the sources for it already (I haven't checked).

Showing that I'm still very much learning this area myself, a bit more
looking around found:

http://www.postgresql.org/docs/current/static/querytree.html

which makes it clear that the range table for the query will contain
what you want. I suspect you'll need to find CTEs and subqueries and
extract the relations they touch; I haven't yet checked to see whether
they're aggregated into the top level range table, but suspect not.

You'll find the code in src/backend/nodes/print.c informative and
useful, though not directly reusable.

Reading src/include/nodes/primnodes.h would be a good idea; it shows how
subqueries, etc, are represented.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: Auto explain target tables

From
Craig Ringer
Date:
On 07/23/2013 11:41 AM, Craig Ringer wrote:
> On 07/23/2013 11:24 AM, Craig Ringer wrote:
>> On 07/21/2013 10:42 PM, Миша Тюрин wrote:
>>>
>>> hi, list, again. the next proposal into auto explain. one would be happy if could set list of target tables and
indexes.sometimes it is very hard to detect who is using your indexes. but turn total logging on under thousands
transactionsper seconds is not seems like nice idea couse size of resulting log files (cpu utilization might not be so
critical)
>>
>> That sounds like a good idea - and since auto_explain is a contrib
>> module, it could potentially be a way to get into PostgreSQL
>> development. The code is in contrib/auto_explain/auto_explain.c .
>>
>> The challenge here is that planned queries don't just touch one table or
>> index. You'd need to walk the query plan (a graph of Node structures) to
>> determine which table(s) and index(es) are touched by the query there's
>> something in the sources for it already (I haven't checked).
> 
> Showing that I'm still very much learning this area myself, a bit more
> looking around found:
> 
> http://www.postgresql.org/docs/current/static/querytree.html
> 
> which makes it clear that the range table for the query will contain
> what you want. I suspect you'll need to find CTEs and subqueries and
> extract the relations they touch; I haven't yet checked to see whether
> they're aggregated into the top level range table, but suspect not.

Of course, having said that I was then curious enough to go digging.
It's trivial to dump the parse tree or query plan tree, as documented in
the above linked documentation, and doing so clearly shows that
subqueries and CTEs have their own range tables.

So you'll have to walk the node tree looking for range tables and check
to see whether any of the range table entries match one of the tables
you're looking for.

It's possible that this is a completely backwards approach; if so,
hopefully one of the more experienced people here will correct me with
the simple and obvious way to do it.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: Auto explain target tables

From
Tom Lane
Date:
Craig Ringer <craig@2ndquadrant.com> writes:
>> Showing that I'm still very much learning this area myself, a bit more
>> looking around found:
>> http://www.postgresql.org/docs/current/static/querytree.html
>> which makes it clear that the range table for the query will contain
>> what you want. I suspect you'll need to find CTEs and subqueries and
>> extract the relations they touch; I haven't yet checked to see whether
>> they're aggregated into the top level range table, but suspect not.

> Of course, having said that I was then curious enough to go digging.
> It's trivial to dump the parse tree or query plan tree, as documented in
> the above linked documentation, and doing so clearly shows that
> subqueries and CTEs have their own range tables.

> So you'll have to walk the node tree looking for range tables and check
> to see whether any of the range table entries match one of the tables
> you're looking for.

> It's possible that this is a completely backwards approach; if so,
> hopefully one of the more experienced people here will correct me with
> the simple and obvious way to do it.

That's correct if you're looking at the parser-output representation.
However, a plan has a "flat" rangetable, so if you're looking at a plan
rather than a raw Query it's much easier.  I'm too lazy to go check
right now, but I think in auto_explain it would be at least as easy to
look at the plan.

There are some definitional issues that might be worth considering:
1. should a view used in a query be a match?  (probably)
2. how about a table that's accessed via a view?  (almost certainly)
3. should an inheritance child table that's excluded by constraint
exclusion, and thus not actually scanned, be a match?  (debatable)

If you believe you want #2 then you don't want to look at raw Query
trees, because view expansion hasn't been done so you won't see anything
hidden under a view.

If you just blindly scan the flat rangetable of a plan then you will get
hits for all three cases.  The only way to exclude #3 would be to scan
the plan tree and identify which rangetable entries are actually scanned
by scan nodes.  However, if you filter in that way then you'll also
exclude #1 (since a view isn't referenced in the plan tree).  You could
have #1 and not #3 if you used some complicated ie fragile hybrid method,
but I'm doubting it's worth it.
        regards, tom lane



Re: Auto explain target tables

From
Craig Ringer
Date:
On 07/23/2013 12:52 PM, Tom Lane wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>>> Showing that I'm still very much learning this area myself, a bit more
>>> looking around found:
>>> http://www.postgresql.org/docs/current/static/querytree.html
>>> which makes it clear that the range table for the query will contain
>>> what you want. I suspect you'll need to find CTEs and subqueries and
>>> extract the relations they touch; I haven't yet checked to see whether
>>> they're aggregated into the top level range table, but suspect not.
> 
>> Of course, having said that I was then curious enough to go digging.
>> It's trivial to dump the parse tree or query plan tree, as documented in
>> the above linked documentation, and doing so clearly shows that
>> subqueries and CTEs have their own range tables.
> 
>> So you'll have to walk the node tree looking for range tables and check
>> to see whether any of the range table entries match one of the tables
>> you're looking for.
> 
>> It's possible that this is a completely backwards approach; if so,
>> hopefully one of the more experienced people here will correct me with
>> the simple and obvious way to do it.
> 
> That's correct if you're looking at the parser-output representation.
> However, a plan has a "flat" rangetable, so if you're looking at a plan
> rather than a raw Query it's much easier.  I'm too lazy to go check
> right now, but I think in auto_explain it would be at least as easy to
> look at the plan.

Thanks for that Tom, I appreciate your taking the time to clarify the
details there.

I should've seen that in the debug plan output:

STATEMENT:  WITH fred AS (SELECT * FROM foo)       SELECT * FROM fred;
LOG:  plan:
DETAIL:     {PLANNEDSTMT          ....          :rtable (             {RTE             :alias <>             :eref
         {ALIAS                :aliasname fred                :colnames ("stuff")                }              ...
       }             {RTE             :alias <>             :eref                {ALIAS                :aliasname foo
            :colnames ("stuff")                }             ....             }          )          :resultRelations <>
        ....
 

STATEMENT:  WITH fred AS (SELECT * FROM foo)       SELECT * FROM fred;



It's certainly as easy, if not easier, to look at the plan in
auto_explain, so that seems to be the way to go.

auto_explain gets a QueryDesc* in its executor hook, which includes a
PlannedStmt* with a List* of RangeTblEntry nodes.

Scanning the range table for relations of interest looks pretty
reasonable so long as you don't want to do anything too tricky.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services