Peter Geoghegan <pg@heroku.com> writes:
> On Tue, Dec 10, 2013 at 1:30 AM, Peter Geoghegan <pg@heroku.com> wrote:
>> pg_stat_statements' fingerprinting logic considers the following two
>> statements as distinct:
>>
>> select 1 in (1, 2, 3);
>> select 1 in (1, 2, 3, 4);
>>
>> This is because the ArrayExpr jumble case jumbles any ArrayExpr's list
>> of elements recursively. In this case it's a list of Const nodes, and
>> the fingerprinting logic jumbles those nodes indifferently.

I think this is a vastly oversimplified explanation of the problem.
In particular, because the planner will flatten an ArrayExpr containing
only Const nodes to an array constant (see eval_const_expressions),
I don't believe the case ever arises in exactly the form you posit here.

A portion of the problem is possibly due to the heuristics in
parse_expr.c's transformAExprIn():

     * We try to generate a ScalarArrayOpExpr from IN/NOT IN, but this is only
     * possible if there is a suitable array type available.  If not, we fall
     * back to a boolean condition tree with multiple copies of the lefthand
     * expression.  Also, any IN-list items that contain Vars are handled as
     * separate boolean conditions, because that gives the planner more scope
     * for optimization on such clauses.

If the original text actually involves a variable number of Vars, then you
will end up with a boolean expression with a varying number of OR arms,
even if the Vars later get flattened to constants. However, it's not
clear to me that anyone would expect such cases to be treated as
identical. Another possibility is a type clash, for example
"x IN (42, 44.1)" will end up as a boolean tree for lack of a common
type for the would-be array elements. That case might possibly be an
issue in practice.
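
To make the split described in that comment concrete, here is a toy model in Python. It is only a sketch, not the parser's actual code: the Item class, the in_list_arms() function, and the rule that a "suitable array type" exists only when every non-Var item has exactly the lefthand type are all assumptions made for illustration (real type resolution is more involved), as is the requirement of more than one non-Var item before an array is built.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for one IN-list element."""
    kind: str       # "Const", "Param" or "Var"
    type_name: str  # toy type label, e.g. "int4"


def in_list_arms(lhs_type: str, items: List[Item]) -> List[str]:
    """Return the OR arms a toy transformAExprIn() would produce."""
    var_items = [it for it in items if it.kind == "Var"]
    non_vars = [it for it in items if it.kind != "Var"]

    arms: List[str] = []
    # Assumption: an array type is "suitable" only on an exact type match.
    same_type = all(it.type_name == lhs_type for it in non_vars)
    if len(non_vars) > 1 and same_type:
        # All non-Var items collapse into a single array comparison.
        arms.append("ScalarArrayOpExpr")
    else:
        # Fallback: one boolean condition per item.
        arms.extend("OpExpr" for _ in non_vars)
    # Items containing Vars are always kept as separate conditions.
    arms.extend("OpExpr" for _ in var_items)
    return arms


if __name__ == "__main__":
    consts = [Item("Const", "int4")] * 3
    variables = [Item("Var", "int4")] * 3
    print(in_list_arms("int4", consts))     # one array arm
    print(in_list_arms("int4", variables))  # one arm per Var
```

In this model the number of OR arms tracks the number of Vars, while a list of same-typed constants always yields a single arm regardless of its length.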
But what seems more likely to be annoying people is cases in which the
original text contains a varying number of Param markers. Those might or
might not get folded to constants during planning depending on context,
so that they might or might not look different to pg_stat_statements.

So I suspect the real problem here is that we might want all of these
things to look identical to pg_stat_statements:

    ARRAY[$1, $2, 42]
    ARRAY[$1, $2, $3, 47]
    '{1,2,3,47}'::int[]

Don't see a very clean way to do that ...

regards, tom lane
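
For illustration only, the following Python sketch shows why the three forms listed above come out different under a naive recursive jumble. It is a toy model, not pg_stat_statements' code: the tuple-based node layout and the jumble() and fingerprint() helpers are hypothetical, and the sketch assumes that a Const contributes its type but not its value, while a Param contributes its type and number.

```python
import hashlib
from typing import List


def jumble(node: tuple, out: List[str]) -> None:
    """Append the fingerprint-relevant parts of a toy node tree to out."""
    tag = node[0]
    out.append(tag)
    if tag == "Const":
        # ("Const", type, value): the value is deliberately ignored.
        out.append(node[1])
    elif tag == "Param":
        # ("Param", type, number): both fields are jumbled.
        out.append(node[1])
        out.append(str(node[2]))
    elif tag == "ArrayExpr":
        # ("ArrayExpr", [elements]): every element is jumbled recursively,
        # so the element count leaks into the fingerprint.
        for elem in node[1]:
            jumble(elem, out)


def fingerprint(node: tuple) -> str:
    """Hash the jumbled parts into a short hex string."""
    parts: List[str] = []
    jumble(node, parts)
    return hashlib.sha1("|".join(parts).encode()).hexdigest()[:12]


def param(n: int) -> tuple:
    return ("Param", "int4", n)


# The three forms from the message, as toy trees.
form_a = ("ArrayExpr", [param(1), param(2), ("Const", "int4", 42)])
form_b = ("ArrayExpr", [param(1), param(2), param(3), ("Const", "int4", 47)])
form_c = ("Const", "int4[]", "{1,2,3,47}")

if __name__ == "__main__":
    for form in (form_a, form_b, form_c):
        print(fingerprint(form))
```

Under these assumptions changing a constant's value leaves the fingerprint alone, but changing the number of array elements, or folding the whole array into one constant, does not.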