pg_stat_statements fingerprinting logic and ArrayExpr - Mailing list pgsql-hackers

From Peter Geoghegan
Subject pg_stat_statements fingerprinting logic and ArrayExpr
Date
Msg-id CAM3SWZRQ5KY1iu+kQO7U3L-Un=O-8fhzWw48VDXm7YF4ktZLWw@mail.gmail.com
Responses Re: pg_stat_statements fingerprinting logic and ArrayExpr
List pgsql-hackers
pg_stat_statements' fingerprinting logic considers the following two
statements as distinct:

select 1 in (1, 2, 3);
select 1 in (1, 2, 3, 4);

This is because the ArrayExpr jumble case jumbles any ArrayExpr's list
of elements recursively. In this case it's a list of Const nodes, and
the fingerprinting logic jumbles each of those nodes in turn, so the
number of elements changes the resulting fingerprint.
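
To make the effect concrete, here is a toy model of the behaviour described above. It is only a sketch, not the actual C code in pg_stat_statements: the Const and ArrayExpr classes, the jumble() and fingerprint() helpers, and the use of SHA-1 are all stand-ins invented for illustration. It assumes that a Const contributes only its consttype to the jumble, not its value.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Union

INT4OID = 23  # consttype used in the example statements (int4)


@dataclass
class Const:
    consttype: int
    value: object  # kept for completeness; this model never jumbles it


@dataclass
class ArrayExpr:
    elements: List["Node"]


Node = Union[Const, ArrayExpr]


def jumble(node: Node, buf: bytearray) -> None:
    """Append a byte representation of node to buf, recursing into children."""
    if isinstance(node, Const):
        # Assumption: only the type of a constant is significant.
        buf += b"Const" + node.consttype.to_bytes(4, "little")
    elif isinstance(node, ArrayExpr):
        buf += b"ArrayExpr"
        for element in node.elements:  # one contribution per element
            jumble(element, buf)
    else:
        raise TypeError(f"unsupported node: {node!r}")


def fingerprint(node: Node) -> str:
    """Hash the jumble of node (hypothetical stand-in for the query ID)."""
    buf = bytearray()
    jumble(node, buf)
    return hashlib.sha1(bytes(buf)).hexdigest()


def int_list(*values: int) -> ArrayExpr:
    """Build the ArrayExpr for a list of integer constants."""
    return ArrayExpr([Const(INT4OID, v) for v in values])


three = fingerprint(int_list(1, 2, 3))
four = fingerprint(int_list(1, 2, 3, 4))
```

In this model, lists of the same length but with different values share a fingerprint, while `three` and `four` differ purely because the second list contributes one more Const to the jumble.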

Somebody told me that they think that pg_stat_statements should not do
that. This person felt that it would be preferable for such
expressions to be normalized without regard to the number of distinct
Const elements. I suppose that this would work by determining if the
ArrayExpr elements list was a list of Const nodes and only Const
nodes. Iff that turned out to be the case, something else would be
jumbled (something other than the list) that would essentially be a
representation of "some list of zero or more (or maybe one or more)
Const nodes with consttype of, in this example, 23". I think that this
would make at least one person happy, because of course the two
statements above would have their costs aggregated within a single
pg_stat_statements entry.
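
The idea can be sketched with a toy model as follows. This is an illustration under assumptions, not a patch: the classes, the is_const_only_list() check, and the "ConstList" marker are hypothetical names, and the sketch arbitrarily picks the "one or more" reading and additionally requires all elements to share one consttype.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Union

INT4OID = 23  # int4, as in the example statements
TEXTOID = 25  # text, used only to show that the type still matters


@dataclass
class Const:
    consttype: int
    value: object  # not jumbled in this model


@dataclass
class ArrayExpr:
    elements: List["Node"]


Node = Union[Const, ArrayExpr]


def is_const_only_list(elements: List[Node]) -> bool:
    """True for a non-empty list of Const nodes that share one consttype."""
    if not elements or not all(isinstance(e, Const) for e in elements):
        return False
    return len({e.consttype for e in elements}) == 1


def jumble(node: Node, buf: bytearray) -> None:
    """Jumble node, collapsing Const-only lists into a single marker."""
    if isinstance(node, Const):
        buf += b"Const" + node.consttype.to_bytes(4, "little")
    elif isinstance(node, ArrayExpr):
        buf += b"ArrayExpr"
        if is_const_only_list(node.elements):
            # "Some list of Const nodes with this consttype": length ignored.
            consttype = node.elements[0].consttype
            buf += b"ConstList" + consttype.to_bytes(4, "little")
        else:
            for element in node.elements:  # fall back to per-element jumbling
                jumble(element, buf)
    else:
        raise TypeError(f"unsupported node: {node!r}")


def fingerprint(node: Node) -> str:
    buf = bytearray()
    jumble(node, buf)
    return hashlib.sha1(bytes(buf)).hexdigest()


def const_list(consttype: int, *values: object) -> ArrayExpr:
    return ArrayExpr([Const(consttype, v) for v in values])


same = fingerprint(const_list(INT4OID, 1, 2, 3)) == fingerprint(
    const_list(INT4OID, 1, 2, 3, 4)
)
```

Here `same` is True, so both example statements would land in one entry, while a list of text constants or a list containing a non-Const element would still be fingerprinted separately.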

I'm not sure that I agree, but there is anecdata that suggests that it
isn't uncommon for these sorts of queries to be broken out when
they're all traceable back to a single point in the application
(apparently it's common for Django apps to do so, perhaps
questionably). If we assume that doing what I've described has no real
downside, then it would probably be worth implementing. Plus I'm
pretty sure that tools that do regex normalization are already doing
something analogous. Thoughts?
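
For comparison, a regex-based normalizer might collapse such lists along these lines. This is a guess at the general technique, not the behaviour of any particular tool; the normalize() function and its pattern are made up for illustration and only handle integer and simple single-quoted literals.

```python
import re

# One integer or simple single-quoted string literal (no escaped quotes).
_LITERAL = r"(?:\d+|'[^']*')"

# "IN (" followed by one or more comma-separated literals and ")".
_IN_LIST = re.compile(
    r"\bin\s*\(\s*" + _LITERAL + r"(?:\s*,\s*" + _LITERAL + r")*\s*\)",
    re.IGNORECASE,
)


def normalize(sql: str) -> str:
    """Replace every literal-only IN list with a single placeholder."""
    return _IN_LIST.sub("in (?)", sql)


a = normalize("select 1 in (1, 2, 3);")
b = normalize("select 1 in (1, 2, 3, 4);")
```

Both `a` and `b` come out as "select 1 in (?);", so a text-level normalizer of this kind already treats the two statements as one.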

-- 
Peter Geoghegan


