Re: pg_stat_statements and "IN" conditions - Mailing list pgsql-hackers
From | Álvaro Herrera |
---|---|
Subject | Re: pg_stat_statements and "IN" conditions |
Date | |
Msg-id | 202502121839.vjr3jm6yb35h@alvherre.pgsql Whole thread Raw |
In response to | Re: pg_stat_statements and "IN" conditions (Dmitry Dolgov <9erthalion6@gmail.com>) |
Responses |
Re: pg_stat_statements and "IN" conditions
|
List | pgsql-hackers |
On 2025-Feb-12, Dmitry Dolgov wrote: > I've been experimenting with this today, and while it's easy to > implement, Great. > there is one annoying thing for which I don't have a solution > yet. When generating a normalized version for such merged queries in > pgss we rely on expression location, something like: > > select i from t where i in (a1, a2, a3, ..., aN); > | | > expr loc1 expr loc2 > > We remember loc1 and loc2, then do not copy anything between then into > the normalized query. Now, the expression location is only known up to > the parsing token, without taking into account e.g. parenthesis in more > complex expressions. Which means we don't know exactly where an > expression starts or ends, and it's hard to correctly represent queries > like: > > select i from t where i in (((a1)), ((a2)), ((a3)), ..., ((aN))); > | | > expr loc1 expr loc2 > > The normalized version looks like this: > > select i from t where i in (((...))); > > While it does not affect the actual functionality and is purely > cosmetic, it's quite visible and causes questions. The nastiness level of this seems quite low, compared to what happens to this other example if we didn't handle these easy cases: create table t (a float); select i from t where i in (1, 2); select i from t where i in (1, '2'); select i from t where i in ('1', 2); select i from t where i in ('1', '2'); select i from t where i in (1.0, 1.0); (The point here is that the datatype differs for the constants from the lexer down in each of these cases.) I think it's more important to handle this better than what the posted patch does, than improving the lexing in presence of other lexical elements in the query. With the current patch I get _five_ pg_stat_statements entries from these queries above, where only one of them was able to apply merging of the elements: queryid │ query ──────────────────────┼───────────────────────────────────── -5783267088740508246 │ select i from t where i in ($1, $2) 6446023427308995149 │ select i from t where i in ($1, $2) 3778867339896201523 │ select i from t where i in (...) -8733218180609156532 │ select i from t where i in ($1, $2) -5106509834475638715 │ select i from t where i in ($1, $2) If I understand what you're saying, it's that the extra parenthesis cause the recorded query text be a little uglier (but the queryid still ends up being one and the same for all queries), which seems much less of a problem. I'm okay saying that cases like that can be improved later. (It seems to me that you want to improve the way we pass the lexed string down to pg_stat_statements, and frankly that even seems a different problem altogether.) -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
pgsql-hackers by date: