Re: [HACKERS] BUG: pg_stat_statements query normalization issueswith combined queries - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: [HACKERS] BUG: pg_stat_statements query normalization issueswith combined queries
Date
Msg-id alpine.DEB.2.20.1612272151540.4911@lancre
Whole thread Raw
In response to Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] BUG: pg_stat_statements query normalization issues with combined queries
List pgsql-hackers
Hello Tom,

>> How? The issue is that stmtmulti silently skip some ';' when empty
>> statements are found, [...]
>
> Maybe you should undo that.

I was trying to be minimally invasive in the parser, i.e. not to change 
any rules.

> I've generally found that trying to put optimizations into the grammar 
> is a bad idea; it's better to KISS in gram.y and apply optimizations 
> later on.

I can change rules, but I'm not sure it will be better. It would allow to 
remove the last ';' location in the lexer which Kyotar does not like, but 
it would add some new stretching to compute the length of statements and 
remove the empty statements.

I would say that it would be different, but not necessarily better.

>>> - Make all parse nodes have the following two members at the
>>> beginning. This unifies all parse node from the view of setting
>>> their location. Commenting about this would be better.
>>>
>>> | NodeTag   type;
>>> | int       location;
>
> I do not like this.  You are making what should be a small patch into
> a giant, invasive one.

I would not say that the current patch is giant & invasive, if you 
abstract the two added fields to high-level statements.

> I'd think about adding a single parse node type that corresponds to 
> "stmt" in the grammar and really doesn't do anything but carry the 
> location info for the overall statement.  That info could then be 
> transferred into the resulting Query node.

I'm not sure I understand your suggestion. Currently I have added the 
location & length information to all high-level statements nodes, plus 
query and planned. I think that planned is necessary for utility 
statements.

I understand that you suggest to add a new intermediate structure:
  typedef struct {    NodeTag tag;    int location, length;    Node *stmt;  } ParseNode;

So that raw_parser would return a List<ParseNode> instead of a List<Node>, 
and the statements would be unmodified.

> Problem solved with (I think) very little code touched.

Hmmm...

Then raw_parser() callers have to manage a List<ParseNode> instead of the 
List<Node>, I'm not sure of the size of the propagation to manage the 
added indirection level appropriately: raw_parser is called 4 times (in 
parser, tcop, commands, plpgsql...). The first call in tcop is 
copyObject(), equal(), check_log_statement(), errdetail_execute()...

Probably it can be done, but I'm moderately unthousiastic: it looks like 
starting unwinding a ball of wool, and I'm not sure where it would stop, 
as it means touching at least the 4 uses of raw_parser in 4 distinct part 
of postgres, whereas the current approach did not change anything for 
raw_parser callers, although at the price of modifying all statement 
nodes.

Did I read you correctly?

-- 
Fabien.



pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Re: [HACKERS] Hooks
Next
From: Thomas Munro
Date:
Subject: Re: [HACKERS] [sqlsmith] Crash reading pg_stat_activity