On Wed, Nov 25, 2015 at 9:13 AM, Lukas Fittl <lukas@fittl.com> wrote:
On Mon, Nov 23, 2015 at 11:53 PM, Peter Geoghegan <pg@heroku.com> wrote:
One specific justification he gave for not using pg_stat_statements was:
"Doesn’t merge bind vars in IN()" (See slide #11)
I wonder:
* How do other people feel about this? Personally, I've seen enough problems of this kind in the field that "slippery slope" arguments against this don't seem very compelling.
As someone who runs a little monitoring service that's based solely on pg_stat_statements data, ignoring IN list length would certainly be a good change.
We currently do this in post-processing, together with other data cleanup (e.g. ignoring the length of a VALUES list in an INSERT statement).
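A minimal sketch of what such a post-processing step might look like for IN lists, assuming the normalized query text uses "?" or "$n" as constant placeholders (which one depends on the pg_stat_statements version). The function name collapse_in_lists and the "IN (...)" replacement marker are hypothetical choices for illustration, not the actual code of any monitoring service:

```python
import re

# A single normalized constant: "?" or a numbered parameter such as "$3".
# Assumption: the query text has already been normalized by pg_stat_statements.
_PLACEHOLDER = r"(?:\?|\$\d+)"

# An IN list made up only of placeholders, e.g. "IN (?, ?, ?)" or "IN ($1, $2)".
# Subqueries such as "IN (SELECT ...)" do not match and are left alone.
_IN_LIST = re.compile(
    r"\bIN\s*\(\s*" + _PLACEHOLDER + r"(?:\s*,\s*" + _PLACEHOLDER + r")*\s*\)",
    re.IGNORECASE,
)


def collapse_in_lists(query: str) -> str:
    """Replace every placeholder-only IN list with a fixed marker,
    so that queries differing only in list length compare equal."""
    return _IN_LIST.sub("IN (...)", query)


if __name__ == "__main__":
    a = collapse_in_lists("SELECT * FROM t WHERE id IN (?, ?, ?)")
    b = collapse_in_lists("SELECT * FROM t WHERE id IN (?)")
    print(a == b)
```

Entries whose collapsed text is identical can then be grouped and their counters summed. The same idea could be extended to multi-row VALUES lists.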
Given that pgss data is normalized and you don't know which plan was chosen, I don't think distinguishing queries by the length of the list really helps anyone.
I see pg_stat_statements as a high-level overview of which queries have run, and which ones you might want to look into more closely using e.g. auto_explain.
I still plan to try to marry pg_stat_statements and auto_explain for the next iteration of the "online query plans" extension I was proposing a few months ago, and the first thing I was going to look into is rectifying this problem with IN() jumbling. So, have a +1 from me.