Re: [HACKERS] BUG: pg_stat_statements query normalization issues withcombined queries - Mailing list pgsql-hackers
From | Craig Ringer |
---|---|
Subject | Re: [HACKERS] BUG: pg_stat_statements query normalization issues withcombined queries |
Date | |
Msg-id | CAMsr+YF+ZJypAzT3K+aNStOhtv875gb2nk2z1w2PJNscGe1tgQ@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] BUG: pg_stat_statements query normalization issueswith combined queries (Fabien COELHO <coelho@cri.ensmp.fr>) |
Responses |
Re: [HACKERS] BUG: pg_stat_statements query normalization issueswith combined queries
|
List | pgsql-hackers |
On 28 December 2016 at 20:11, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Tom, > >> [...] It's touching every single utility statement type, which is not only >> pretty invasive in itself, but will break any pending patches that add more >> utility statement types. > > > Yep, but it is limited to headers and the break is trivial... I'm not sure how else that part can be done. We can add a common parent for all utility statement types, but that also touches all utility statements, and since we're working with pure C, it means 'type' won't be accessible as e.g. someCreateStmt.type, but only as something like ((UtilityStatement)someCreateStmt).type or someCreateStmt.hdr.type . e.g. typedef struct UtilityStatement { NodeTag type; int querytext_offset; int querytext_length; }; typedef struct CreateStmt { UtilityStatement hdr; RangeVar *relation; /* relation to create */ .... } I don't think that's so bad, but it's not exactly less invasive. >> And you've not followed through on the implications of adding those fields >> in, for instance, src/backend/nodes/; > > Hmmm, probably you are pointing out structure read/write functions. I'm sure that's his intent, yes. > I would have hoped that such code would be automatically derived from the > corresponding struct definition... That would be nice. But no, unfortunately, we have no such preprocessing. To do so would require that we either generate things like parsenodes.h and the relevant parts of copyfuncs.c etc from a common source, or have a toolchain that can ingest parsenodes.h and emit suitable code. The latter means either "fragile, hideous Perl script" or requiring something like LLVM or the gcc frontend libraries in the toolchain to parse the C code and emit an intermediate representation of the structures that we could then generate our functions from. If you've come from the Java world you'd expect that we'd just generate the copy and makenode stuff from introspection of the node structs, or preprocess them or something, but it's not very practical. They don't change enough to really make it worth it either. > Hmmm. I've run into a tiny small ever so little detail in trying to apply > this clever approach... > > For fixing the information in pg_stat_statement, the location data must be > transported from the parsed node to the query to the planned node, because > the later two nodes types are passed to different hooks. Yep. > Now the detail is that utility statements, which seems to be nearly all of > them but select/update/delete/insert, do not have plans: The statement > itself is its own plan... so there is no place to store the location & > length. Yeah. I don't see any way around adding location info for utility statements one way or the other. We could wrap every utility statement up in a container with the location info, but that'd break every user of ProcessUtility_hook and unless we put the container as the first by-value member of each utility struct anyway, it'd mean extra allocations for each utility statement. Not very appealing. TBH I think that for utility statements, adding a member struct with location info is the least-bad option. Something like: /** Location of the original querytext of an individual parsetree or plan relative to the* start of the querytext. This is usually offset 0 and the length of the querytext its self,* but in multi-statements and other cases where a single parse has multiple products* hooks may need to know which part of the querytext corresponds to which product* plan.** offset and length are both -1 for plans with no corresponding querytext, such as* additional/replacement queries emitted by the rewrite phase or additional statements* produced as a result of processing some utility statements.* typedef struct StmtLocation { int offset; int length; } typedef struct CreateStmt { NodeTag type; StmtLocation location; .... } Personally I don't much care if we do it this way, or by embedding the NodeTag too and creating a common UtilityStmt. The latter probably has benefits for later extension, but is a bit more annoying in the shorter term. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Remote DBA, Training &Services
pgsql-hackers by date: