Re: pg_stat_statements fingerprinting logic and ArrayExpr - Mailing list pgsql-hackers

From Tom Lane
Subject Re: pg_stat_statements fingerprinting logic and ArrayExpr
Date
Msg-id 12017.1386716938@sss.pgh.pa.us
Whole thread Raw
In response to Re: pg_stat_statements fingerprinting logic and ArrayExpr  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: pg_stat_statements fingerprinting logic and ArrayExpr
Re: pg_stat_statements fingerprinting logic and ArrayExpr
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> Right, but the flip side is that you could collapse things that people
> don't want collapsed.  If you've got lots of query that differ only in
> that some of them say user_id IN (const1, const2) and others say
> user_id IN (const1, const2, const3) and the constants vary a lot, then
> of course this seems attractive.  On the other hand if you have two
> queries and one of them looks like this:

> WHERE status IN ('active') AND user_id = ?

> and the other looks like this:

> WHERE status IN ('inactive', 'deleted') AND user_id = ?

> ...it might actually annoy you to have those two things conflated;
> it's easy to imagine one having much different performance
> characteristics than the other.

Of course, "status = 'active'" and "status = 'deleted'" might have very
different performance characteristics all by themselves, yet we've
already hard-wired a decision that pg_stat_statements will conflate them.
So I don't think the above argument holds a lot of water.

A different point of view is that it's more or less an implementation
artifact that pg_stat_statements doesn't already see the cases as
equivalent; that happens only because it looks at the querytree before
the planner gets around to constant-folding ARRAY[1,2,3] into the single
Const '{1,2,3}'::int[].

So my objection to what Peter is suggesting is not that it's a bad idea
in isolation, but that I don't see where he's going to stop, short of
reinventing every query-normalization behavior that exists in the planner.
If this particular case is worthy of fixing with a hack in the
fingerprinter, aren't there going to be dozens more with just as good
claims?  (Perhaps not, but what's the basis for thinking this is far
worse than any other normalization issue?)

I'm wondering whether this doesn't indicate that we need to rethink where
the fingerprinter has been plugged in.  I'm not sure that somewhere in
the planner, post-constant-folding, would be a better place; but it's
worth thinking about.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Kevin Grittner
Date:
Subject: Re: Reference to parent query from ANY sublink
Next
From: Noah Misch
Date:
Subject: Re: Dynamic Shared Memory stuff