Improve handling of pg_stat_statements handling of bind "IN" variables - Mailing list pgsql-hackers
From | Pavel Trukhanov |
---|---|
Subject | Improve handling of pg_stat_statements handling of bind "IN" variables |
Date | |
Msg-id | CAF42k=JCfHMJtkAVXCzBn2XBxDC83xb4VhV7jU7enPnZ0CfEQQ@mail.gmail.com Whole thread Raw |
Responses |
Re: Improve handling of pg_stat_statements handling of bind "IN" variables
|
List | pgsql-hackers |
Hi Hackers I would like to embark on a journey to try to implement this issue I found on TODO list – https://www.postgresql.org/message-id/flat/CAM3SWZSpdPB3uErnXWMt3q74y0r%2B84ZsOt2U3HKKes_V7O%2B0Qg%40mail.gmail.com In short: pgss distinguishes "SELECT * WHERE id IN (1, 2)" and "SELECT * WHERE id IN (1, 2, 3)" as two separate queryId's, resulting in separate entries in pgss. While in practice in most cases it should be considered as the same thing. Though it was added in TODO by Bruce Momjian some time ago, I personally have been annoyed by this issue, because we use pgss as a data source in our monitoring system okmeter.io – so we've been using some work arounds for this in our system. The way AFAIU it is suggested to be handled in the previous thread is to not jumble ArrayExpr recursively and just treat it as "some list of zero or more nodes". I have already lurked around related code, but I have stumbled on some problems with the way I see I can implement this. So I want to ask for advice and maybe even guidance because I'm new to PG internals and not a regular in C coding. 1. ArrayExpr ArrayExpr is used to represent not only "IN" clauses, but also for example "SELECT ARRAY[1, 2, 3]" and maybe some other cases I didn't think of. That brings the question whether "IN (...)" should be handled separately from actual usage of ARRAY. Or it is okay for any ARRAY to be jumbled w/o respect to number of entries in it? With that, "SELECT ARRAY[1, 2]" becomes undistinguishable from "SELECT ARRAY[1, 2, 3]" etc in pgss. I'm asking this because I'm not sure if it would be okay to handle both cases in the same way. For example "SELECT ARRAY[1, 2, a] FROM table" and "SELECT ARRAY[b] FROM table" might end up in the same pgss entry. While a separate handling for "IN (...)" seems to require lots of changes – starting from parser (new parser node type) and further. How should I proceed? 2 Weird arrays - with Consts and Params or const expressions or different types etc SELECT * FROM test WHERE a IN (1, $1) SELECT * FROM test WHERE a IN (1, 3+1) SELECT * FROM test WHERE a IN (1, 2.1) SELECT * FROM test WHERE a IN (1.1, 2.1) etc. How those should be handled? Should those be indistinguishable from "IN ($1, $2, $3)" as well? Or such non realistic usage examples are negligible and no one cares what happens with them? 3 Tests in pgss.sql/out and Vars I would like someone to point me in a direction of how could I implement a test that will query "SELECT * FROM test WHERE a IN ($1, $2, $3, ...)" with params, not consts, because I think this is the most common case actually. And existing tests only check consts in "IN" list. I don't see a way to implement such a test with the existing test infrastructure. Though if that might be considered out of the scope for this TODO, it would be okay with me. I would appreciate any feedback. --- Pavel Trukhanov
pgsql-hackers by date: