Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
Date
Msg-id CAEYLb_UgpcBCOT=HV__74FHrG8YcAJakLUcBa+m55H-sXxYC4A@mail.gmail.com
Whole thread Raw
In response to Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)  (Daniel Farina <daniel@heroku.com>)
List pgsql-hackers
On 1 March 2012 00:48, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> I'm curious about the LeafNode stuff.  Is this something that could be
> done by expression_tree_walker?  I'm not completely familiar with it so
> maybe there's some showstopper such as some node tags not being
> supported, or maybe it just doesn't help.  But I'm curious.

Good question. The LeafNode function (which is a bit of a misnomer, as
noted in a comment) looks rather like a walker function, as appears in
the example in a comment in nodeFuncs.c:

* expression_tree_walker() is designed to support routines that traverse* a tree in a read-only fashion (although it
willalso work for routines* that modify nodes in-place but never add/delete/replace nodes).* A walker routine should
looklike this:** bool my_walker (Node *node, my_struct *context)* {*        if (node == NULL)*            return
false;*       // check for nodes that special work is required for, eg:*        if (IsA(node, Var))*        {*
 ... do special actions for Var nodes*        }*        else if (IsA(node, ...))*        {*            ... do special
actionsfor other node types*        }*        // for any node type not specially processed, do:*        return
expression_tree_walker(node,my_walker, (void *) context);* } 

My understanding is that the expression-tree walking support is mostly
useful for the majority of walker code, which only cares about a small
subset of nodes, and hopes to avoid including boilerplate code just to
walk those other nodes that it's actually disinterested in.

This code, unlike most clients of expression_tree_walker(), is pretty
much interested in everything, since its express purpose is to
fingerprint all possible query trees.

Another benefit of expression_tree_walker is that if you miss a
certain node being added, (say a FuncExpr-like node), you get to
automatically have that node walked over to walk to the nodes that you
do in fact care about (such as those within this new nodes args List).
That makes perfect sense in the majority of cases, but here you've
already missed the fields within this new node that FuncExpr itself
lacks, so you're already finger-printing inaccurately. I suppose you
could still at least get the nodetag and still have a warning about
the fingerprinting being inadequate by going down the
expression_tree_walker path, but I'm inclined to wonder if it you
aren't just better of directly walking the tree, if only to encourage
the idea that this code needs to be maintained over time, and to cut
down on the little extra bit of indirection that that imposes.

It's not going to be any sort of burden to maintain it - it currently
stands at a relatively meagre 800 lines of code for everything to do
with tree walking - and the code that will have to be added with new
nodes or refactored along with the existing tree structure is going to
be totally trivial.

All of that said, I wouldn't mind making LeafNode into a walker, if
that approach is judged to be better, and you don't mind documenting
the order in which the tree is walked as deterministic, because the
order now matters in a way apparently not really anticipated by
expression_tree_walker, though that's probably not a problem.

My real concern now is that it's March 1st, and the last commitfest
may end soon. Even though this patch has extensive regression tests,
has been floating around for months, and, I believe, will end up being
a timely and important feature, a committer has yet to step forward to
work towards this patch getting committed. Can someone volunteer,
please? My expectation is that this feature will make life a lot
easier for a lot of Postgres users.

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: 16-bit page checksums for 9.2
Next
From: Tom Lane
Date:
Subject: Re: Collect frequency statistics for arrays